MULTIDIMENSIONAL RATIONAL COVARIANCE EXTENSION
WITH APPLICATIONS TO SPECTRAL ESTIMATION AND 
IMAGE COMPRESSION* 

AXEL RINGH, JOHAN KARLSSON, AND ANDERS LINDQUIST

Abstract. The rational covariance extension problem (RCEP) is an important problem in systems
and control occurring in such diverse fields as control, estimation, system identification, and
signal and image processing, leading to many fundamental theoretical questions. In fact, this inverse
problem is a key component in many identification and signal processing techniques and plays
a fundamental role in prediction, analysis, and modeling of systems and signals. It is well-known 
that the RCEP can be reformulated as a (truncated) trigonometric moment problem subject to a 
rationality condition. In this paper we consider the more general multidimensional trigonometric 
moment problem with a similar rationality constraint. This generalization creates many interesting 
new mathematical questions and also provides new insights into the original one-dimensional problem.
A key concept in this approach is the complete smooth parametrization of all solutions, allowing
solutions to be tuned to satisfy additional design specifications without violating the complexity
constraints. As an illustration of the potential of this approach we apply our results to multidimensional
spectral estimation and image compression. This is just a first step in this direction, and we expect 
that more elaborate tuning strategies will enhance our procedures in the future. 

Key words. Covariance extension, trigonometric moment problem, convex optimization,
generalized entropy, multidimensional spectral estimation, image compression.


1. Introduction. In this paper we consider the (truncated) multidimensional 
trigonometric moment problem with a certain complexity constraint. Many problems 
in multidimensional systems theory, including realization, control, and identification
problems, can be cast in this framework [3]. Other applications of this type are image 
processing [22] and spectral estimation in radar, sonar, and medical imaging [71]. 

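More precisely, given a set of complex numbers $c_k$, $k \in \Lambda$, where $k := (k_1, \ldots, k_d)$
is a vector-valued index belonging to a specified index set $\Lambda \subset \mathbb{Z}^d$, find a nonnegative
bounded measure $d\mu$ such that

$$c_k = \int_{\mathbb{T}^d} e^{i(k,\theta)}\, d\mu(\theta) \quad \text{for all } k \in \Lambda, \tag{1.1}$$

where $\mathbb{T} := (-\pi, \pi]$, $\theta := (\theta_1, \ldots, \theta_d) \in \mathbb{T}^d$, and $(k, \theta) := \sum_{j=1}^{d} k_j \theta_j$ is the scalar
product in $\mathbb{R}^d$. Moreover, let $e^{i\theta} := (e^{i\theta_1}, \ldots, e^{i\theta_d})$. By the Lebesgue decomposition
[67, p. 121], the measure $d\mu$ can be decomposed in a unique fashion as

$$d\mu(\theta) = \Phi(e^{i\theta})\, dm(\theta) + d\hat{\mu}(\theta) \tag{1.2a}$$

into an absolutely continuous part $\Phi\, dm$ with spectral density $\Phi$ and Lebesgue measure

$$dm(\theta) := \frac{1}{(2\pi)^d} \prod_{j=1}^{d} d\theta_j$$
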
and a singular part $d\hat{\mu}$ containing, e.g., spectral lines. This is an inverse problem,
which in general has infinitely many solutions if one exists. A first problem of interest
to us in this paper is how to smoothly parametrize the family of all solutions that
satisfy the rational complexity constraint

$$\Phi(e^{i\theta}) = \frac{P(e^{i\theta})}{Q(e^{i\theta})}, \quad \text{where } P, Q \in \bar{\mathfrak{P}}_+ \setminus \{0\}, \tag{1.2b}$$

where $\mathfrak{P}_+$ is the convex cone of trigonometric polynomials

$$P(e^{i\theta}) = \sum_{k \in \Lambda} p_k e^{-i(k,\theta)} \tag{1.3}$$

that are positive for all $\theta \in \mathbb{T}^d$, and $\bar{\mathfrak{P}}_+$ is its closure; $\bar{\mathfrak{P}}_+$ will be called the positive
cone. Moreover, we use the notation $\partial\mathfrak{P}_+ := \bar{\mathfrak{P}}_+ \setminus \mathfrak{P}_+$ for its boundary; i.e., the subset
of $P \in \bar{\mathfrak{P}}_+$ that are zero in at least one point. In this paper we develop a theory based
on convex optimization for this problem.

For $d = 1$ and $\Lambda = \{0, 1, \ldots, n\}$ this trigonometric moment problem with complexity
constraint is well understood, and it has a solution with $d\hat{\mu} = 0$ if and only if
the Toeplitz matrix

$$\begin{bmatrix} c_0 & c_{-1} & \cdots & c_{-n} \\ c_1 & c_0 & \cdots & c_{-n+1} \\ \vdots & \vdots & \ddots & \vdots \\ c_n & c_{n-1} & \cdots & c_0 \end{bmatrix}$$

is positive definite [50]. Such a sequence, $c_0, \ldots, c_n$, will therefore be called a positive
sequence in this paper.
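
The positivity test above can be tried numerically. The following is a minimal sketch, not part of the original paper: the helper names `toeplitz` and `is_positive_definite` are hypothetical, and positive definiteness is checked by attempting a Cholesky factorization, which is one standard way to do it.

```python
def toeplitz(c):
    """Hermitian Toeplitz matrix T[j][k] = c_{j-k}, using c_{-k} = conj(c_k)."""
    n = len(c)
    return [[c[j - k] if j >= k else c[k - j].conjugate() for k in range(n)]
            for j in range(n)]

def is_positive_definite(T, tol=1e-12):
    """Try to factor T = L L^H; T is positive definite iff every pivot exceeds tol."""
    n = len(T)
    L = [[0j] * n for _ in range(n)]
    for j in range(n):
        pivot = (T[j][j] - sum(abs(L[j][m]) ** 2 for m in range(j))).real
        if pivot <= tol:
            return False
        L[j][j] = complex(pivot ** 0.5)
        for i in range(j + 1, n):
            s = sum(L[i][m] * L[j][m].conjugate() for m in range(j))
            L[i][j] = (T[i][j] - s) / L[j][j]
    return True

# Illustrative sequences (made up for this sketch, not taken from the paper).
positive = is_positive_definite(toeplitz([1.0, 0.5, 0.2]))
not_positive = is_positive_definite(toeplitz([1.0, 0.9, 0.2]))
```

For the first sequence all pivots are positive, so it is a positive sequence in the sense used here; the second fails at the last pivot.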

In his pioneering work on spectral estimation, J.P. Burg observed that among all
spectral densities $\Phi$ satisfying the moment constraints

$$c_k = \int_{\mathbb{T}} e^{ik\theta}\, \Phi(e^{i\theta})\, \frac{d\theta}{2\pi}, \quad k = 0, 1, \ldots, n, \tag{1.4a}$$

the one with maximal entropy

$$\int_{\mathbb{T}} \log \Phi(e^{i\theta})\, \frac{d\theta}{2\pi} \tag{1.4b}$$

is of the form $\Phi(e^{i\theta}) = 1/Q(e^{i\theta})$, where $Q(e^{i\theta})$ is a positive trigonometric polynomial
[4,5]. Later, in 1981, R.E. Kalman posed the rational covariance extension problem
(RCEP) [38]: given a finite covariance sequence $c_0, \ldots, c_n$, determine all infinite
extensions $c_{n+1}, c_{n+2}, \ldots$ such that

$$\Phi(e^{i\theta}) = \sum_{k=-\infty}^{\infty} c_k e^{-ik\theta}$$

is a positive rational function of degree bounded by $2n$. This problem, which is
important in systems theory [50], is precisely a (one-dimensional) trigonometric moment
problem with the complexity constraint (1.2b). The designation ‘covariance’
emanates from the fact that $c_0, c_1, c_2, \ldots$ can be interpreted as the covariance lags
$E\{y(t+k)\overline{y(t)}\} = c_k$ of a wide-sense stationary stochastic process $y$ with spectral
density $\Phi$.

In 1983, T.T. Georgiou [29] (also see [30]) proved that to each positive covariance
sequence and positive numerator polynomial $P$, there exists a rational covariance
extension of the sought form (1.2b). He also conjectured that this extension is unique and
hence gives a complete parameterization of all rational extensions of degree bounded
by $2n$. This conjecture was first proven in [16], where it was also shown that the
complete parameterization is smooth, allowing for tuning. The proofs in [16,29,30] were
nonconstructive, using topological methods. Later a constructive proof was given in
[11,12], leading to an approach based on convex optimization. Here $\Phi$ is obtained as
the maximizer of a generalized entropy functional

$$\int_{\mathbb{T}} P(e^{i\theta}) \log \Phi(e^{i\theta})\, \frac{d\theta}{2\pi} \tag{1.5}$$

subject to the moment conditions (1.4a), and the problem is reduced to solving a dual
convex optimization problem. Since then, this approach has been extensively studied
[6-8,12,24,25,31,49,56,58,64,66,75], and it has also been generalized to
a quite complete theory for scalar moment problems [9,10,13,14,34]. Moreover, a
number of multivariate counterparts, i.e., when $\Phi$ is matrix-valued, have also been
solved [1,2,28,33,48,59,60,74].

A considerable amount of research has also been done in the area of multidimensional
spectral estimation; for example, Woods [73], Ekstrom and Woods [23],
Dickinson [20], and Lev-Ari et al. [46], to mention a few. Of special interest are also the results
by Lang and McClellan [42-45,53,54], as they consider a similar entropy functional.
In many of these areas it seems natural to consider rational models. Nevertheless, the
multidimensional version of the RCEP has only been considered in a few instances,
for the two-dimensional case in [32, 33] and in the more general setting of moment
problems with arbitrary basis functions in our recent paper [41].

The purpose of this paper is to extend the theory of rational covariance extension 
from the one-dimensional to the general d-dimensional case and to develop methods for 
multidimensional spectral estimation. In Section 2 we summarize the main theoretical 
results of the paper. This includes the main theorem characterizing the optimal 
solutions to the weighted entropy functional, which is then proved in Section 3. In 
Section 4 we prove that under certain assumptions the problem is well-posed in the 
sense of Hadamard and provide comments and examples related to these assumptions. 
In Section 5 we consider simultaneous matching of covariance lags and logarithmic 
moments, and Section 6 is devoted to a discrete version of the problem, where the
measure $d\mu$ consists of discrete point masses placed equidistantly in a discrete grid
in $\mathbb{T}^d$. This is a generalization to the multidimensional case of recent results in [49]
and is motivated by computational considerations. In fact, these discrete solutions
provide approximations to solutions to moment problems with absolutely continuous
measures and allow for fast arithmetic based on the fast Fourier transform (FFT)
(cf. [64]). Finally, Sections 7 and 8 are devoted to two examples of how the theory
can be applied; the first in system identification and the second in image compression. 

2. Main results. Given the moments $\{c_k\}_{k \in \Lambda}$, the problem under consideration
is to find a positive measure (1.2) of bounded variation satisfying the moment
constraint (1.1). Let us pause to pin down the structure of the index set $\Lambda$. In view
of (1.1), we have $c_{-k} = \bar{c}_k$, where the bar denotes complex conjugation. Revisiting the
one-dimensional result [13-15] for moment problems with arbitrary basis functions,
we observe that the theory holds also for sequences with “gaps”, e.g., for a sequence
$c_0, \ldots, c_{k-1}, c_{k+1}, \ldots, c_n$. As seen in [41] this observation equally applies to the
multidimensional case. Therefore, we shall consider covariance sequences $\{c_k\}_{k \in \Lambda}$ where
$\Lambda \subset \mathbb{Z}^d$ is any finite index set such that $0 \in \Lambda$ and $-\Lambda = \Lambda$. We will denote the
cardinality of $\Lambda$ by $|\Lambda|$. Further, let $n_j = \max\{|k_j| : k \in \Lambda\}$ denote the maximum range
of $\Lambda$ in dimension $j$.
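
The requirements on the index set are easy to check mechanically. The sketch below is illustrative only; the function name `check_index_set` is hypothetical. It verifies that 0 belongs to the set and that the set is symmetric under negation, and it returns the ranges $n_j$ and the cardinality.

```python
def check_index_set(index_set):
    """Return (valid, ranges, cardinality) for a finite set of integer d-tuples.

    valid      -- True iff 0 is in the set and the set equals its negation
    ranges     -- tuple (n_1, ..., n_d) with n_j = max |k_j| over the set
    cardinality -- number of distinct indices
    """
    s = set(index_set)
    d = len(next(iter(s)))
    zero = (0,) * d
    symmetric = all(tuple(-kj for kj in k) in s for k in s)
    ranges = tuple(max(abs(k[j]) for k in s) for j in range(d))
    return (zero in s) and symmetric, ranges, len(s)

# A 3x3 block of two-dimensional indices, and a one-dimensional set with gaps.
square = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)]
gapped = [(0,), (1,), (-1,), (3,), (-3,)]
square_info = check_index_set(square)
gapped_info = check_index_set(gapped)
```

Both sets are admissible; the second illustrates that "gaps" (here the missing indices ±2) are allowed.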

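Next, given the inner product

$$\langle c, p \rangle = \sum_{k \in \Lambda} c_k \bar{p}_k,$$

we define the open convex cone

$$\mathfrak{C}_+ := \{ c \mid \langle c, p \rangle > 0 \text{ for all } P \in \bar{\mathfrak{P}}_+ \setminus \{0\} \},$$

the closure of which, $\bar{\mathfrak{C}}_+$, is the dual cone of $\bar{\mathfrak{P}}_+$, with boundary $\partial\mathfrak{C}_+$.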

We now extend the domain of the generalized entropy functional in (1.5) to
multidimensional nonnegative measures of the type (1.2) and consider functionals

$$\mathbb{I}_P(d\mu) = \int_{\mathbb{T}^d} P(e^{i\theta}) \log \Phi(e^{i\theta})\, dm(\theta), \tag{2.1}$$

where $\Phi\, dm$ is the absolutely continuous part of $d\mu$.^1 This functional is concave, but not
strictly concave since the singular part of the measure does not influence the value.
This leads to the optimization problem to maximize (2.1) subject to the moment
constraints (1.1). Since the constraints are linear, this is a convex problem. However,
as it is an infinite-dimensional optimization problem, it is more convenient to work
with the dual problem, which has a finite number of variables but an infinite number
of constraints. In fact, the dual problem amounts to minimizing

$$\mathbb{J}_P(Q) = \langle c, q \rangle - \int_{\mathbb{T}^d} P(e^{i\theta}) \log Q(e^{i\theta})\, dm(\theta) \tag{2.2}$$

over all $Q \in \bar{\mathfrak{P}}_+$, and hence $Q(e^{i\theta}) \ge 0$ for all $\theta \in \mathbb{T}^d$. Note that (2.2) takes an
infinite value for $Q \equiv 0$.

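Theorem 2.1. For every $c \in \mathfrak{C}_+$ and $P \in \bar{\mathfrak{P}}_+ \setminus \{0\}$ the functional (2.2) is strictly
convex and has a unique minimizer $\hat{Q} \in \bar{\mathfrak{P}}_+ \setminus \{0\}$. Moreover, there exists a unique
$\hat{c} \in \partial\mathfrak{C}_+$ and a nonnegative singular measure $d\hat{\mu}$ with support $\mathrm{supp}(d\hat{\mu}) \subseteq \{\theta \in \mathbb{T}^d \mid \hat{Q}(e^{i\theta}) = 0\}$ such that

$$c_k = \int_{\mathbb{T}^d} e^{i(k,\theta)} \left( \frac{P}{\hat{Q}}\, dm + d\hat{\mu} \right) \quad \text{for all } k \in \Lambda$$

and

$$\hat{c}_k = \int_{\mathbb{T}^d} e^{i(k,\theta)}\, d\hat{\mu} \quad \text{for all } k \in \Lambda.$$
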


^1 Note that the absolutely continuous part is uniquely defined by the Lebesgue decomposition, and
hence the functional $\mathbb{I}_P(d\mu)$ is uniquely defined. Moreover, this definition of $\mathbb{I}_P(d\mu)$ can be motivated
by the fact that $\lim_{n\to\infty} \int_{\mathbb{T}^d} \log(\Phi(e^{i\theta}) + f_n(\theta))\, dm(\theta) = \int_{\mathbb{T}^d} \log(\Phi(e^{i\theta}))\, dm(\theta)$ for any log-integrable
$\Phi$ and nonnegative “good kernel” $f_n(\theta)$ (see, e.g., [70, p. 48]). See also the discussion in Section 3.2.
For any such $d\hat{\mu}$, the measure $d\mu(\theta) = (P(e^{i\theta})/\hat{Q}(e^{i\theta}))\, dm(\theta) + d\hat{\mu}(\theta)$ is an optimal
solution to the problem to maximize (2.1) subject to the moment constraints (1.1).
Moreover, $d\hat{\mu}$ can be chosen with support in at most $|\Lambda| - 1$ points.

Corollary 2.2. Let $c \in \mathfrak{C}_+$. Then, for any

$$d\mu = \frac{P}{Q}\, dm, \quad P, Q \in \bar{\mathfrak{P}}_+ \setminus \{0\},$$

satisfying the moment condition (1.1), $Q$ is the unique minimizer over $\bar{\mathfrak{P}}_+$ of the dual
functional (2.2).

This corollary implies that, for any $c \in \mathfrak{C}_+$, any measure $d\mu$ with only absolutely
continuous rational part matching $c$ can be obtained by solving (2.2) for a suitable
$P$. However, although $c \in \mathfrak{C}_+$, not all $P$ result in an absolutely continuous solution
$d\mu = (P/Q)\, dm$ that satisfies (1.1). Nevertheless, the case when this happens is of
particular interest.

Corollary 2.3. Suppose that $d \le 2$. Then, for any $c \in \mathfrak{C}_+$ and $P \in \mathfrak{P}_+$
there exists a $Q \in \mathfrak{P}_+$ such that $d\mu = (P/Q)\, dm$ satisfies (1.1). Moreover, this $Q$ is
the unique solution to the strictly convex optimization problem to minimize the dual
functional (2.2) over all $Q \in \bar{\mathfrak{P}}_+$.

This result can be deduced from the early work of Lang and McClellan [44],
although they do not consider rational solutions explicitly, nor parameterizations of
them. Note that Corollary 2.3 is only valid for $P \in \mathfrak{P}_+$, while Theorem 2.1 holds
for all $P \in \bar{\mathfrak{P}}_+ \setminus \{0\}$. This will be further discussed in Section 4, where the proof of
Corollary 2.3 will also be concluded.

2.1. Covariance and cepstral matching. It follows from Theorem 2.1 and 
Corollary 2.3 that Q is completely determined by the pair (c, P). For d = 1 the choice 
P = 1 leads to Burg’s formulation (1.4), which has been termed the maximum-entropy 
(ME) solution. On the other hand, better dynamical range of the spectrum can be 
obtained by taking advantage of the extra degrees of freedom in P. Several methods 
for selecting P have been suggested in the one-dimensional setting. Examples are 
methods based on inverse problems as in [26,39,40], a linear-programming approach 
as in [6,7], and simultaneous matching of covariances and cepstral coefficients as in 
[55] and independently in [6,7,24,49]. Here, in the multivariate setting, we consider 
the selection of P based on the simultaneous matching of logarithmic moments. 

We define the (real) cepstrum of a multidimensional spectrum as the (real)
logarithm of its absolutely continuous part. The cepstral coefficients are the corresponding
Fourier coefficients

$$\gamma_k = \int_{\mathbb{T}^d} e^{i(k,\theta)} \log \Phi(e^{i\theta})\, dm(\theta), \quad \text{for } k \in \Lambda \setminus \{0\}. \tag{2.3}$$



For spectra that only have an absolutely continuous part this agrees with earlier 
definitions in the literature (see, e.g., [57, pp. 500-507] or [19, Chapter 6]).
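
As an illustration of the definition of cepstral coefficients, the following sketch approximates them in the one-dimensional case by a Riemann sum on a uniform grid. The function name `cepstral_coefficients`, the grid size, and the test spectrum are assumptions made for this example. For $\Phi(e^{i\theta}) = 1/|1 - a e^{i\theta}|^2$ with $|a| < 1$, expanding $-\log(1 - a e^{i\theta})$ in a power series gives $\gamma_k = a^k/k$ for $k \ge 1$, which the code reproduces.

```python
import cmath
import math

def cepstral_coefficients(phi, n, grid_size=256):
    """Approximate gamma_k = int e^{ik theta} log(phi(theta)) dtheta / (2 pi), k = 1..n,
    by averaging over a uniform grid on the circle."""
    thetas = [2.0 * math.pi * l / grid_size for l in range(grid_size)]
    log_phi = [math.log(phi(t)) for t in thetas]
    return [sum(cmath.exp(1j * k * t) * v for t, v in zip(thetas, log_phi)) / grid_size
            for k in range(1, n + 1)]

a = 0.5  # illustrative pole location, chosen for this sketch
spectrum = lambda t: 1.0 / abs(1.0 - a * cmath.exp(1j * t)) ** 2
gammas = cepstral_coefficients(spectrum, 3)
expected = [a ** k / k for k in range(1, 4)]
```

Here `gammas` agrees with `expected` to within rounding, since the aliasing error of the uniform grid decays geometrically for this analytic spectrum.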

Given a set of cepstral coefficients we now also enforce cepstral matching of the
sought family of spectra. This means that we look for $\Phi = P/Q$ that also satisfy (2.3).
Note that the index $k = 0$ is not included in (2.3). In fact, for technical reasons, we
shall set $\gamma_0 = 1$. Also, to avoid trivial cancellations of constants in $P/Q$, we need to
introduce the set

$$\mathfrak{P}_{+,\circ} := \{ P \in \mathfrak{P}_+ \mid p_0 = 1 \}.$$
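Theorem 2.4. Let $\gamma_k$, $k \in \Lambda \setminus \{0\}$, be any sequence of complex numbers such
that $\gamma_{-k} = \bar{\gamma}_k$, and set $\gamma = \{\gamma_k\}_{k \in \Lambda}$ where $\gamma_0 = 1$. Then, for $c \in \mathfrak{C}_+$, the convex
optimization problem (D) to minimize

$$\mathbb{J}(P, Q) = \langle c, q \rangle - \langle \gamma, p \rangle + \int_{\mathbb{T}^d} P \log\left(\frac{P}{Q}\right) dm \tag{2.4}$$

subject to $(P, Q) \in \bar{\mathfrak{P}}_{+,\circ} \times \bar{\mathfrak{P}}_+$ has an optimal solution $(\hat{P}, \hat{Q})$. If such a solution
belongs to $\mathfrak{P}_{+,\circ} \times \mathfrak{P}_+$, then $\hat{\Phi} = \hat{P}/\hat{Q}$ satisfies the logarithmic moment condition
(2.3) and $d\mu = \hat{\Phi}\, dm$ the moment condition (1.1). Moreover, $\hat{\Phi}$ is also an optimal
solution to the problem (P) to maximize

$$\mathbb{I}(\Phi) = \int_{\mathbb{T}^d} \log \Phi\, dm \tag{2.5}$$

subject to (1.1) and (2.3) for $d\mu = \Phi\, dm$. Finally, if $d \le 2$, then $\hat{P} \in \mathfrak{P}_{+,\circ}$ implies
that $\hat{Q} \in \mathfrak{P}_+$.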

For reasons to become clear in Section 5, the optimization problems (P) and
(D) will be referred to as the primal and dual problem, respectively. A drawback
with Theorem 2.4 is that even when $d \le 2$, a solution to the dual problem can be
guaranteed to have a rational spectrum that satisfies (1.1) and (2.3) only if $\hat{P} \in \mathfrak{P}_{+,\circ}$.
In fact, as we shall see in Section 5, for a solution with $\hat{P} \in \partial\mathfrak{P}_{+,\circ}$ we might have
$\hat{Q} \in \partial\mathfrak{P}_+$ and hence covariance mismatch. A remedy in the case $d \le 2$ is to use
the Enqvist regularization, introduced in the one-dimensional setting in [24]. This
makes the optimization problem strictly convex and forces the solution $\hat{P}$ into the set
$\mathfrak{P}_{+,\circ}$. In this way we obtain strict covariance matching and approximate cepstral
matching. This statement will be made precise in Theorem 5.7 in Section 5.1.

2.2. The circulant covariance extension problem. In the recent paper [49],
Lindquist and Picci studied, for the case $d = 1$, the situation when the underlying
stochastic process $y(t)$ is periodic. For the $N$-periodic case, the covariance sequence
must satisfy the extra condition $c_{N-k} = \bar{c}_k$; i.e., the $N \times N$ Toeplitz matrix of one
period is Hermitian circulant. In this case, the spectral measure must be discrete with
point masses at $\zeta_\ell = e^{i\ell \frac{2\pi}{N}}$, $\ell = 0, 1, \ldots, N - 1$, on the discrete unit circle, and instead
of the moment condition (1.1) we have

$$c_k = \frac{1}{N} \sum_{\ell=0}^{N-1} \zeta_\ell^{k}\, \Phi(\zeta_\ell), \tag{2.6}$$

which is the inverse discrete Fourier transform of the sequence $(\Phi(\zeta_\ell))$.
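
To make the inverse-DFT interpretation concrete, here is a small sketch for $d = 1$. It assumes the discrete moment condition has the form $c_k = \frac{1}{N}\sum_{\ell=0}^{N-1} \zeta_\ell^k \Phi(\zeta_\ell)$ with $\zeta_\ell = e^{2\pi i \ell/N}$; the function name `circulant_moments` and the sample spectrum are hypothetical choices for illustration.

```python
import cmath
import math

def circulant_moments(phi_samples, n):
    """Moments c_0..c_n of a discrete spectrum sampled at the N-th roots of unity:
    c_k = (1/N) * sum_l zeta_l^k * Phi(zeta_l)."""
    N = len(phi_samples)
    zeta = [cmath.exp(2j * math.pi * l / N) for l in range(N)]
    return [sum(zeta[l] ** k * phi_samples[l] for l in range(N)) / N
            for k in range(n + 1)]

# Sample Phi(e^{i theta}) = c_0 + c_1 (e^{i theta} + e^{-i theta}) on N = 8 points.
N = 8
c0, c1 = 1.0, 0.5
samples = [c0 + 2.0 * c1 * math.cos(2.0 * math.pi * l / N) for l in range(N)]
moments_out = circulant_moments(samples, 2)
```

Because $N > 2n$ here, the sums recover $c_0$ and $c_1$ and return zero for $k = 2$, up to rounding. For large $N$ the same sums would normally be evaluated with an FFT rather than this direct $O(N^2)$ loop.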

This was generalized to the multidimensional case in [65], where a circulant version
of Theorem 2.1 and Corollary 2.3 was derived. For $N := (N_1, \ldots, N_d)$, consider the
discretization of the $d$-dimensional torus

$$\zeta_\ell := \left( e^{i\ell_1 \frac{2\pi}{N_1}}, \ldots, e^{i\ell_d \frac{2\pi}{N_d}} \right), \quad \ell \in \mathbb{Z}^d_N,$$

where

$$\mathbb{Z}^d_N := \{ \ell = (\ell_1, \ldots, \ell_d) \mid 0 \le \ell_j \le N_j - 1,\ j = 1, \ldots, d \},$$

and define $|N| = \prod_{j=1}^{d} N_j$. Next, let $\mathfrak{P}_+(N)$ be the positive cone of all trigonometric
polynomials (1.3) such that $P(\zeta_\ell) > 0$ for all $\ell \in \mathbb{Z}^d_N$. Moreover, define the interior
$\mathfrak{C}_+(N)$ of the dual cone as the set of all $\{c_k\}_{k \in \Lambda}$ such that $\langle c, p \rangle > 0$ for all $P \in
\bar{\mathfrak{P}}_+(N) \setminus \{0\}$. Clearly $\mathfrak{P}_+(N) \supset \mathfrak{P}_+$, and hence $\mathfrak{C}_+(N) \subset \mathfrak{C}_+$. Then Theorem 2 and
Corollary 3 in [65] can be combined in the following theorem.

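Theorem 2.5 ([65]). Suppose that $2n_j < N_j$ for $j = 1, \ldots, d$, and let $c \in \mathfrak{C}_+(N)$
and $P \in \bar{\mathfrak{P}}_+(N) \setminus \{0\}$. Then, there exists a $\hat{Q} \in \bar{\mathfrak{P}}_+(N) \setminus \{0\}$ such that $\hat{Q}$ is a solution
to the convex problem to minimize^2

$$\mathbb{J}^N_P(Q) = \langle c, q \rangle - \frac{1}{|N|} \sum_{\ell \in \mathbb{Z}^d_N} P(\zeta_\ell) \log Q(\zeta_\ell)$$

over all $Q \in \bar{\mathfrak{P}}_+(N)$. Moreover, there exists a nonnegative function $\hat{\mu}$ with support
$\mathrm{supp}(\hat{\mu}) \subseteq \{\zeta_\ell \mid \hat{Q}(\zeta_\ell) = 0,\ \ell \in \mathbb{Z}^d_N\}$ such that

$$c_k = \frac{1}{|N|} \sum_{\ell \in \mathbb{Z}^d_N} \zeta_\ell^{k} \left( \frac{P(\zeta_\ell)}{\hat{Q}(\zeta_\ell)} + \hat{\mu}(\zeta_\ell) \right), \quad \zeta_\ell^{k} := \prod_{j=1}^{d} e^{i k_j \ell_j \frac{2\pi}{N_j}}, \tag{2.7}$$

and the number of mass points for $\hat{\mu}$ can be chosen so that at most $|\Lambda| - 1$ points $\hat{\mu}(\zeta_\ell)$
are nonzero. Finally, if $P \in \mathfrak{P}_+(N)$ then $\hat{Q} \in \mathfrak{P}_+(N)$, which is then also unique, and
hence $\Phi = P/\hat{Q}$ satisfies (2.7) with $\hat{\mu} \equiv 0$.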

In [49] it was shown in the one-dimensional case that as $N \to \infty$ the solution
of the discrete problem, call it $\hat{Q}_N$, converges to the solution to the corresponding
continuous problem, call it $\hat{Q}$. This gives a natural way to compute an approximate
solution to the continuous problem using the fast computations of the discrete Fourier
transform. The same holds also in higher dimensions, as seen in the following result.

Theorem 2.6. Suppose that $P \in \bar{\mathfrak{P}}_+ \setminus \{0\}$ and $c \in \mathfrak{C}_+$, and let $\hat{Q}$ and $\hat{Q}_N$ be the
optimal solutions of Theorem 2.1 and Theorem 2.5, respectively. Then

$$\lim_{\min(N) \to \infty} \hat{Q}_N = \hat{Q}$$

uniformly.

3. The Multidimensional rational covariance extension problem. Most 
of this section will be devoted to proving Theorem 2.1. Some technical details are 
deferred to the appendix. Possible interpretations of $P$ will be discussed at the end of
the section together with an example showing the non-uniqueness of the measure $d\hat{\mu}$.

3.1. Proof of Theorem 2.1. 

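3.1.1. Deriving the dual problem. For a given $P \in \bar{\mathfrak{P}}_+ \setminus \{0\}$ and $c \in \mathfrak{C}_+$,
consider the primal problem to maximize (2.1) subject to the moment constraints
(1.1) over the set of nonnegative bounded measures, i.e., over $d\mu = \Phi\, dm + d\hat{\mu}$, where
$\Phi$ is a nonnegative $L_1(\mathbb{T}^d)$ function and $d\hat{\mu}$ is a nonnegative singular measure. The
Lagrangian of this problem becomes

$$\mathcal{L}_P(\Phi, d\hat{\mu}, Q) = \int_{\mathbb{T}^d} P \log \Phi\, dm + \sum_{k \in \Lambda} \bar{q}_k \left( c_k - \int_{\mathbb{T}^d} e^{i(k,\theta)} (\Phi\, dm + d\hat{\mu}) \right),$$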


^2 Note that limits such as $P \log(Q)$ and $P/Q$ may not be well defined in the multidimensional
case, and therefore we define the expressions $P \log(Q)$ and $P/Q$ to be zero whenever $P = 0$. This is
not needed in the continuous case as the set where $P$ is zero is of measure zero.
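where $\bar{q}_k$, $k \in \Lambda$, are Lagrange multipliers. Identifying $\sum_{k \in \Lambda} q_k e^{-i(k,\theta)}$ with the
trigonometric polynomial $Q$, this can be simplified to

$$\mathcal{L}_P(\Phi, d\hat{\mu}, Q) = \int_{\mathbb{T}^d} P \log \Phi\, dm + \langle c, q \rangle - \int_{\mathbb{T}^d} Q \Phi\, dm - \int_{\mathbb{T}^d} Q\, d\hat{\mu}.$$

The dual function $\sup_{d\mu \ge 0} \mathcal{L}_P(\Phi, d\hat{\mu}, Q)$ is finite only if $Q \in \bar{\mathfrak{P}}_+ \setminus \{0\}$. To see this,
let $Q \notin \bar{\mathfrak{P}}_+$, i.e., suppose there is $\theta_0 \in \mathbb{T}^d$ for which $Q(e^{i\theta_0}) < 0$. Then, by letting
$\hat{\mu}(\theta_0) \to \infty$ in the singular part $d\hat{\mu}$, we get that $\mathcal{L}_P(\Phi, d\hat{\mu}, Q) \to \infty$. Moreover, if
$Q \equiv 0$ then since $P$ is continuous and $P \not\equiv 0$ there is a small neighbourhood where
$P > 0$. Letting $\Phi \to \infty$ in this neighbourhood we again have that $\mathcal{L}_P(\Phi, d\hat{\mu}, Q) \to \infty$.
Hence we can restrict the multipliers to $\bar{\mathfrak{P}}_+ \setminus \{0\}$.

Now note that any pair $(\Phi, d\hat{\mu})$ maximizing $\mathcal{L}_P(\Phi, d\hat{\mu}, Q)$ must satisfy $\int_{\mathbb{T}^d} Q\, d\hat{\mu} =
0$, or equivalently, the support of $d\hat{\mu}$ is contained in $\{\theta \in \mathbb{T}^d \mid Q(e^{i\theta}) = 0\}$. Otherwise
letting $d\hat{\mu} = 0$ would result in a larger value of the Lagrangian.

Note that the value of the Lagrangian becomes $-\infty$ for any $\Phi$ that vanishes on a
set of positive measure, and hence such a $\Phi$ cannot be optimal. Now, for any direction
$\delta\Phi$ such that $\Phi + \varepsilon\,\delta\Phi$ is a nonnegative $L_1(\mathbb{T}^d)$ function for sufficiently small $\varepsilon > 0$,
consider the directional derivative

$$\delta\mathcal{L}_P(\Phi, d\hat{\mu}, Q; \delta\Phi) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \left( \mathcal{L}_P(\Phi + \varepsilon\,\delta\Phi, d\hat{\mu}, Q) - \mathcal{L}_P(\Phi, d\hat{\mu}, Q) \right) = \int_{\mathbb{T}^d} \left( \frac{P}{\Phi} - Q \right) \delta\Phi\, dm.$$

For a stationary point this must be nonpositive for all feasible directions $\delta\Phi$, and in
particular this holds for $\delta\Phi = \Phi\, \mathrm{sign}(P - Q\Phi)$, which by construction is a feasible
direction. For this direction, the constraint becomes $\int_{\mathbb{T}^d} |P - Q\Phi|\, dm \le 0$, requiring
that $\Phi = P/Q$ a.e., which inserted into the dual function yields

$$\sup_{d\mu \ge 0} \mathcal{L}_P(\Phi, d\hat{\mu}, Q) = \mathbb{J}_P(Q) + \int_{\mathbb{T}^d} P (\log P - 1)\, dm, \tag{3.1}$$

where the last term in (3.1) does not depend on $Q$ and

$$\mathbb{J}_P(Q) = \langle c, q \rangle - \int_{\mathbb{T}^d} P \log Q\, dm. \tag{3.2}$$

Hence the dual problem is equivalent to minimizing $\mathbb{J}_P$ over $\bar{\mathfrak{P}}_+ \setminus \{0\}$.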

3.1.2. Lower semicontinuity of the dual functional. For any $Q \in \mathfrak{P}_+$,
$\mathbb{J}_P(Q)$ is clearly continuous. However, for $Q \in \partial\mathfrak{P}_+$, $\log Q$ will approach $-\infty$ in
the points where $Q(e^{i\theta}) = 0$, and hence we need to consider the behavior of the
integral term in (3.2). Since $P$ is a fixed nonnegative trigonometric polynomial, it
suffices to consider the integral $\int_{\mathbb{T}^d} \log Q\, dm$. However, this integral is known as the
(logarithmic) Mahler measure of the Laurent polynomial $Q$ [52], and it is finite for all
$Q \in \bar{\mathfrak{P}}_+ \setminus \{0\}$ [68, Lemma 2, p. 223]. This leads to the following lemma, the proof of
which is deferred to the appendix.

Lemma 3.1. For any $P \in \bar{\mathfrak{P}}_+ \setminus \{0\}$ and $c \in \mathfrak{C}_+$, the functional $\mathbb{J}_P : \bar{\mathfrak{P}}_+ \setminus \{0\} \to \mathbb{R}$
is lower semicontinuous.

3.1.3. The uniqueness of a solution. From the first directional derivative

$$\delta\mathbb{J}_P(Q; \delta Q) = \langle c, \delta q \rangle - \int_{\mathbb{T}^d} \frac{P}{Q}\, \delta Q\, dm$$

of the dual functional (3.2), we readily derive the second

$$\delta^2\mathbb{J}_P(Q; \delta Q) = \int_{\mathbb{T}^d} \frac{P}{Q^2} (\delta Q)^2\, dm,$$

which is clearly nonnegative for all variations $\delta Q$. Therefore, since, in addition, the
constraint set is convex, the dual problem is a convex optimization problem. To see
that $\mathbb{J}_P$ is actually strictly convex, note that since $P$ is positive almost everywhere,
so is $P/Q^2$. Therefore, for $\delta^2\mathbb{J}_P(Q; \delta Q)$ to be zero we must have $\delta Q = 0$ almost
everywhere, which implies that it is zero everywhere since it is continuous. This
implies that if there exists a solution, this solution is unique.
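
The first and second variations above are exactly what a Newton-type method needs. The following is a minimal sketch under stated assumptions, not the authors' algorithm: it treats $d = 1$, $\Lambda = \{-1, 0, 1\}$ with real symmetric coefficients, replaces the integrals by averages over a uniform grid, and uses a damped Newton iteration with a simple backtracking rule. All names (`trig`, `moments`, `dual`, `solve_dual`) and the test data are hypothetical.

```python
import math

M = 512  # grid points on the circle; integrals are replaced by grid averages
THETA = [2.0 * math.pi * l / M for l in range(M)]

def trig(coef, t):
    """Real symmetric polynomial a0 + 2*a1*cos(t), i.e. a_{-1} = a_1 = coef[1]."""
    return coef[0] + 2.0 * coef[1] * math.cos(t)

def moments(P, Q):
    """Grid approximation of c_k = int e^{ik theta} (P/Q) dm for k = 0, 1."""
    w = [trig(P, t) / trig(Q, t) for t in THETA]
    return (sum(w) / M, sum(math.cos(t) * x for t, x in zip(THETA, w)) / M)

def dual(c, P, Q):
    """J_P(Q) = <c, q> - int P log Q dm, where <c, q> = c0*q0 + 2*c1*q1 here."""
    integral = sum(trig(P, t) * math.log(trig(Q, t)) for t in THETA) / M
    return c[0] * Q[0] + 2.0 * c[1] * Q[1] - integral

def solve_dual(c, P, iters=100):
    """Damped Newton iteration on (q0, q1), started from Q = 1."""
    q = (1.0, 0.0)
    for _ in range(iters):
        m = moments(P, q)
        g = (c[0] - m[0], 2.0 * (c[1] - m[1]))  # gradient of J_P
        if math.hypot(g[0], g[1]) < 1e-12:
            break
        # Hessian: int (P/Q^2) b b^T dm with b = (1, 2 cos theta)
        h00 = h01 = h11 = 0.0
        for t in THETA:
            w = trig(P, t) / trig(q, t) ** 2 / M
            b = 2.0 * math.cos(t)
            h00 += w
            h01 += w * b
            h11 += w * b * b
        det = h00 * h11 - h01 * h01
        step = ((-h11 * g[0] + h01 * g[1]) / det, (h01 * g[0] - h00 * g[1]) / det)
        slope = g[0] * step[0] + g[1] * step[1]  # negative for a descent step
        s, f0 = 1.0, dual(c, P, q)
        while s > 1e-12:
            cand = (q[0] + s * step[0], q[1] + s * step[1])
            # stay inside the open cone (q0 > 2|q1|) and require sufficient decrease
            if cand[0] > 2.0 * abs(cand[1]) and dual(c, P, cand) <= f0 + 1e-4 * s * slope:
                q = cand
                break
            s *= 0.5
        else:
            break  # no acceptable step: treat the current point as converged
    return q

# Generate moments from a known pair (P, Q_true), then recover Q from (c, P).
P = (1.0, 0.3)
Q_true = (1.25, -0.5)
c = moments(P, Q_true)
Q_hat = solve_dual(c, P)
```

Since the moments and the functional use the same grid, the gradient vanishes at `Q_true`, and strict convexity makes it the unique minimizer of the discretized functional; `Q_hat` should therefore agree with `Q_true` to several digits.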

3.1.4. The existence of a solution. If we can show that $\mathbb{J}_P$ has compact
sublevel sets, then $\mathbb{J}_P$ must have a minimum since it is lower semicontinuous (Lemma 3.1).

Lemma 3.2. The sublevel sets $\mathbb{J}_P^{-1}(-\infty, r]$ are compact for all $r \in \mathbb{R}$.

For the proof of Lemma 3.2 we need the following lemma modifying Proposition
2.1 in [14] to the present setting.

Lemma 3.3. For a fixed $c \in \mathfrak{C}_+$, there exists an $\varepsilon > 0$ such that for every
$(P, Q) \in (\bar{\mathfrak{P}}_+ \setminus \{0\}) \times (\bar{\mathfrak{P}}_+ \setminus \{0\})$

$$\mathbb{J}_P(Q) \ge \varepsilon \|Q\|_\infty - \int_{\mathbb{T}^d} P\, dm\, \log \|Q\|_\infty. \tag{3.3}$$


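Proof. Since $\langle c, q \rangle$ is a continuous function, it achieves a minimum on the compact
set $\{Q \in \bar{\mathfrak{P}}_+ \setminus \{0\} \mid \|q\|_\infty = 1\}$, where $\|q\|_\infty := \max_{k \in \Lambda} |q_k|$. The minimum value
$\kappa_c$ must be positive since $c \in \mathfrak{C}_+$ and hence $\langle c, q \rangle > 0$ for any $q \in \bar{\mathfrak{P}}_+ \setminus \{0\}$. For any
$Q \in \bar{\mathfrak{P}}_+ \setminus \{0\}$ we thus have

$$\langle c, q \rangle = \left\langle c, \frac{q}{\|q\|_\infty} \right\rangle \|q\|_\infty \ge \kappa_c \|q\|_\infty. \tag{3.4}$$

By Lemma A.1, $\|Q\|_\infty \le |\Lambda| \|q\|_\infty$, and hence by choosing $\varepsilon \le \kappa_c/|\Lambda|$ we get

$$\langle c, q \rangle \ge \kappa_c \|q\|_\infty \ge \frac{\kappa_c}{|\Lambda|} \|Q\|_\infty \ge \varepsilon \|Q\|_\infty. \tag{3.5}$$

To obtain a bound on the second term in (3.2), we observe that

$$\int_{\mathbb{T}^d} P \log Q\, dm = \int_{\mathbb{T}^d} P \log \left[ \frac{Q}{\|Q\|_\infty} \right] dm + \int_{\mathbb{T}^d} P\, dm\, \log \|Q\|_\infty \le \int_{\mathbb{T}^d} P\, dm\, \log \|Q\|_\infty,$$

since $Q/\|Q\|_\infty \le 1$. Hence (3.3) follows. □

Proof of Lemma 3.2. For any $r \in \mathbb{R}$ large enough for the sublevel set $\{Q \in
\bar{\mathfrak{P}}_+ \setminus \{0\} \mid r \ge \mathbb{J}_P(Q)\}$ to be nonempty, we have

$$r \ge \mathbb{J}_P(Q) \ge \varepsilon \|Q\|_\infty - \int_{\mathbb{T}^d} P\, dm\, \log \|Q\|_\infty$$

for some $\varepsilon > 0$ (Lemma 3.3). Comparing linear and logarithmic growth we see that
the sublevel set is bounded both from above and from below. Moreover, since $\mathbb{J}_P$
is lower semicontinuous (Lemma 3.1), the sublevel sets are also closed [67, p. 37].
Therefore they are compact. □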
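3.1.5. Existence of a singular measure. It remains to show that there exists
a measure $d\hat{\mu}$ prescribed by the theorem and that $d\mu = (P/\hat{Q})\, dm + d\hat{\mu}$ is in fact an
optimal solution to the primal problem to maximize (2.1) subject to the moment
constraints (1.1). To this end, we invoke the KKT conditions [51, p. 249] for the dual
optimization problem, which require that the functional

$$L_P(Q, d\tilde{\mu}) = \langle c, q \rangle - \int_{\mathbb{T}^d} P \log(Q)\, dm - \int_{\mathbb{T}^d} Q\, d\tilde{\mu}$$

is stationary at $\hat{Q}$ for some nonnegative measure^3 $d\tilde{\mu}$, and that the complementary
slackness condition $\int_{\mathbb{T}^d} \hat{Q}\, d\tilde{\mu} = 0$ holds so that $\mathrm{supp}(d\tilde{\mu}) \subseteq \{\theta \in \mathbb{T}^d \mid \hat{Q}(e^{i\theta}) = 0\}$.
Applying the Wirtinger derivatives [62, pp. 66-69]

$$\frac{\partial}{\partial \bar{z}} = \frac{1}{2} \left( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \right),$$

where $z = x + iy$ is a complex variable, we obtain

$$\frac{\partial L_P(Q, d\tilde{\mu})}{\partial \bar{q}_k} = c_k - \int_{\mathbb{T}^d} e^{i(k,\theta)} \left( \frac{P}{Q}\, dm + d\tilde{\mu} \right),$$

from which we see that a stationary point must satisfy the moment condition (1.1).
This shows that there exists a singular measure $d\tilde{\mu}$ with the properties prescribed in
the first part of the proof, such that $d\mu = (P/\hat{Q})\, dm + d\tilde{\mu}$ matches the covariances, and
we may therefore take $d\hat{\mu} = d\tilde{\mu}$. Next, for $k \in \Lambda$, we define

$$\hat{c}_k := c_k - \int_{\mathbb{T}^d} e^{i(k,\theta)} \frac{P}{\hat{Q}}\, dm, \tag{3.7}$$

from which we see that $\hat{c}$ is unique, although $d\hat{\mu}$ might not be. For a $Q \in \bar{\mathfrak{P}}_+ \setminus \{0\}$,

$$\langle \hat{c}, q \rangle = \sum_{k \in \Lambda} \hat{c}_k \bar{q}_k = \sum_{k \in \Lambda} \left( \int_{\mathbb{T}^d} e^{i(k,\theta)}\, d\hat{\mu} \right) \bar{q}_k = \int_{\mathbb{T}^d} Q\, d\hat{\mu},$$

which shows that $\langle \hat{c}, q \rangle \ge 0$ for all $Q \in \bar{\mathfrak{P}}_+ \setminus \{0\}$, and thus $\hat{c} \in \bar{\mathfrak{C}}_+$. However, for $\hat{Q}$ we
have $\langle \hat{c}, \hat{q} \rangle = \int_{\mathbb{T}^d} \hat{Q}\, d\hat{\mu} = 0$ by complementary slackness, which shows that $\hat{c} \in \partial\mathfrak{C}_+$.
Moreover, it is shown in [43] that there exists a discrete representation with support
in $|\Lambda| - 1$ points for all $\hat{c} \in \partial\mathfrak{C}_+$. To show that the solution is optimal also for the
primal problem we observe that, for all $d\mu = \Phi\, dm + d\hat{\mu}$,

$$\mathbb{I}_P(d\mu) \le \mathcal{L}_P(\Phi, d\hat{\mu}, \hat{Q}) \le \mathbb{J}_P(\hat{Q}) + \int_{\mathbb{T}^d} P (\log P - 1)\, dm.$$

Since equality holds for the feasible point $d\mu = (P/\hat{Q})\, dm + d\hat{\mu}$, optimality follows.
This completes the proof of Theorem 2.1.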

An alternative proof of the results in Sections 3.1.2-3.1.4 can be constructed along 
the lines of [27, Section 5]. The proof in that paper uses the existence of a
coercive spectral density, which in our case follows from the existence of a spectral
density in the exponential family [33]. Also compare this with the proofs of Theorem
5.1 and Theorem 5.2 in [41], which deal with a more general setting.

^3 Note that by Riesz's representation theorem (for periodic functions), the dual of $C(\mathbb{T}^d)$ is the
space of bounded measures on $\mathbb{T}^d$ [51, p. 133].
3.2. Comments and an example. In the one-dimensional case it has already
been observed that $P$ need not be confined to the cone $\bar{\mathfrak{P}}_+ \setminus \{0\}$ but could be a
general nonnegative integrable function with zero locus of measure zero [13,14]. This
fact was used in [34] to interpret the functional (1.5) as a Kullback-Leibler
pseudo-distance between $P$ and $\Phi$ and hence with $P$ as a Kullback-Leibler prior. In
fact, maximizing (1.5) is equivalent to minimizing the Kullback-Leibler divergence

$$\mathbb{D}(P \| \Phi) = \int_{\mathbb{T}} P \log \frac{P}{\Phi}\, dm,$$

which is nonnegative for functions with the same total mass and equal to zero only
when the functions are equal. In our present more general setting, $P$ could be any
absolutely integrable, nonnegative function for which the set $\{\theta \in \mathbb{T}^d \mid P(e^{i\theta}) = 0\}$
has measure zero. In this context it is also possible to interpret the functional (2.1)
as a Kullback-Leibler distance, not only between the two functions $P$ and $\Phi$, but
between the two measures $dp := P\, dm$ and $d\mu$. Since $dp$ is absolutely continuous with
respect to $d\mu$ we obtain [63] (see, in particular, equation 3.1)

$$\mathbb{D}(dp \| d\mu) = \int_{\mathbb{T}^d} \log\left( \frac{dp}{d\mu} \right) dp = \int_{\mathbb{T}^d} P \log \frac{P}{\Phi}\, dm,$$

where $(dp/d\mu) = P/\Phi$ is the Radon-Nikodym derivative.
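
The nonnegativity property of the Kullback-Leibler divergence can be checked on a grid. This is only an illustrative sketch: it assumes the divergence has the form $\int P \log(P/\Phi)\, dm$, works in one dimension, and the function name `kl_divergence` and the two test densities (both of total mass one) are made up for the example.

```python
import math

def kl_divergence(P, Phi, grid_size=512):
    """Grid approximation of int P log(P / Phi) dtheta / (2 pi) on the circle."""
    thetas = [2.0 * math.pi * l / grid_size for l in range(grid_size)]
    return sum(P(t) * math.log(P(t) / Phi(t)) for t in thetas) / grid_size

# Two positive densities with the same total mass (their grid averages are both 1).
prior = lambda t: 1.0 + 0.6 * math.cos(t)
density = lambda t: 1.0 - 0.4 * math.cos(t)

d_prior_density = kl_divergence(prior, density)
d_prior_prior = kl_divergence(prior, prior)
```

The first value is strictly positive and the second vanishes, consistent with the interpretation of $P$ as a prior that the optimal $\Phi$ is pulled towards.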

Except in the one-dimensional case, the singular part of the measure is in general
not unique. To illustrate this fact, we consider the following example in two
dimensions, similar to Example 5.4 in [41], where $Q$ has zeros along a line.

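Example 3.4. Given $\Lambda = \{(0,0), (-1,0), (1,0), (0,-1), (0,1), (-1,-1), (1,1),
(-1,1), (1,-1)\}$, consider

$$P(e^{i\theta_1}, e^{i\theta_2}) = (1 - \cos\theta_1),$$
$$Q(e^{i\theta_1}, e^{i\theta_2}) = (1 - \cos\theta_1)(2 - \cos\theta_2).$$

Let $c$ be the covariances of the spectrum $\Phi = P/Q$, i.e., $c_{0,0} = 1/\sqrt{3}$, $c_{1,0} = 0$, $c_{0,1} =
-1 + 2/\sqrt{3}$, $c_{1,1} = 0$ and $c_{-1,1} = 0$, the remaining covariances being uniquely determined
by the conjugate symmetry $c_{-k} = \bar{c}_k$. Moreover, let $\hat{c}$ be given by

$$\hat{c}_k = \int_{\mathbb{T}^2} e^{i(k,\theta)}\, 2\pi\delta(\theta_1)\, dm(\theta),$$

so that $\hat{c}_{0,0} = 1$, $\hat{c}_{1,0} = 1$, $\hat{c}_{0,1} = 0$, $\hat{c}_{1,1} = 0$ and $\hat{c}_{-1,1} = 0$. Clearly $P, Q \in \bar{\mathfrak{P}}_+ \setminus \{0\}$, and
thus $c \in \mathfrak{C}_+$ since

$$\langle c, r \rangle = \sum_{k \in \Lambda} c_k \bar{r}_k = \int_{\mathbb{T}^2} R(e^{i\theta}) \frac{P(e^{i\theta})}{Q(e^{i\theta})}\, dm(\theta) > 0$$

for any $R \in \bar{\mathfrak{P}}_+ \setminus \{0\}$. In the same way,

$$\langle \hat{c}, q \rangle = \int_{\mathbb{T}^2} Q(e^{i\theta})\, 2\pi\delta(\theta_1)\, dm(\theta) = \int_{\mathbb{T}} (1 - \cos\theta_1)\, \delta(\theta_1)\, d\theta_1 \int_{\mathbb{T}} (2 - \cos\theta_2)\, \frac{d\theta_2}{2\pi} = 0,$$

and thus $\hat{c} \in \partial\mathfrak{C}_+$. Hence, $(Q, \hat{c})$ is the unique pair prescribed by Theorem 2.1 for the
covariance sequence $c + \hat{c}$ and the numerator polynomial $P$. However, since $Q$ is zero
for $\theta_1 = 0$, any measure $d\hat{\mu}$ with support constrained to the line $\theta_1 = 0$ and mass 1
such that $\int_{\mathbb{T}^2} \cos\theta_2\, d\hat{\mu} = 0$ is a solution.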
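
The numbers in Example 3.4 can be verified with a few lines of code. The sketch below is illustrative and rests on two assumptions: $dm$ is normalized so that the torus has total mass one, and the singular measure is taken to be a unit mass spread uniformly in $\theta_2$ along the line $\theta_1 = 0$, which gives $\hat{c}_k = 1$ when $k_2 = 0$ and $\hat{c}_k = 0$ otherwise. Since $P/Q = 1/(2 - \cos\theta_2)$ does not depend on $\theta_1$, one-dimensional grid averages suffice.

```python
import math

M = 64  # uniform grid in theta_2; the integrand is analytic, so this is ample
theta2 = [2.0 * math.pi * l / M for l in range(M)]
phi = [1.0 / (2.0 - math.cos(t)) for t in theta2]  # P/Q away from theta_1 = 0

c00 = sum(phi) / M                                             # expect 1/sqrt(3)
c01 = sum(math.cos(t) * v for t, v in zip(theta2, phi)) / M    # expect -1 + 2/sqrt(3)

# Coefficients of Q = (1 - cos t1)(2 - cos t2) = 2 - 2 cos t1 - cos t2 + cos t1 cos t2.
q = {(0, 0): 2.0, (1, 0): -1.0, (-1, 0): -1.0, (0, 1): -0.5, (0, -1): -0.5,
     (1, 1): 0.25, (-1, -1): 0.25, (1, -1): 0.25, (-1, 1): 0.25}

# Moments of the assumed singular measure: 1 when k_2 = 0, else 0.
c_hat = {k: (1.0 if k[1] == 0 else 0.0) for k in q}
pairing = sum(c_hat[k] * q[k] for k in q)                      # <c_hat, q>
```

The pairing evaluates to zero, matching the complementary slackness condition, while `c00` and `c01` reproduce the stated covariances.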
4. Well-posedness and counter examples. The intuition behind Corollary
2.3 is that the optimal solution $\hat{Q}$ is repelled from the boundary $\partial\mathfrak{P}_+$ by the following
assumption (Assumption 4.1) whenever $P \in \mathfrak{P}_+$. Then, since the measure $d\hat{\mu}$ can
only have mass in the zeros of $\hat{Q}$, we must have $d\hat{\mu} = 0$.

Assumption 4.1. The cone $\bar{\mathfrak{P}}_+$ has the property

$$\int_{\mathbb{T}^d} \frac{1}{Q}\, dm(\theta) = \infty \quad \text{for all } Q \in \partial\mathfrak{P}_+.$$


As noted in [14], Assumption 4.1 always holds in the one-dimensional case ($d = 1$),
since the trigonometric functions are Lipschitz continuous. Using results by Georgiou
[32, p. 819] it can be shown that this assumption is also valid for $d = 2$. However, Lang
and McClellan [44] note that Assumption 4.1 does not hold in general for dimensions
$d \ge 3$. To see this, they consider the polynomial $Q(e^{i\theta}) = \sum_{j=1}^{d} (1 - \cos\theta_j) \in \partial\mathfrak{P}_+$
and show that $\int_{\mathbb{T}^d} \frac{1}{Q}\, dm < \infty$ for $d \ge 3$. In fact, we have the following amplification
of this fact, the proof of which we defer to the appendix.

Proposition 4.2. For $d \ge 3$, Assumption 4.1 does not hold if the index set $\Lambda$
contains at least three linearly independent vector-valued indices.
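
A heuristic local computation indicates why the dimension threshold is three. It is a sketch only, taking the Lang and McClellan polynomial to be $Q(e^{i\theta}) = \sum_{j=1}^{d}(1 - \cos\theta_j)$, which vanishes only at $\theta = 0$, and passing to polar coordinates near that point:

```latex
Q(e^{i\theta}) = \sum_{j=1}^{d} (1 - \cos\theta_j)
              = \tfrac{1}{2}\,|\theta|^2 + O(|\theta|^4)
\quad\Longrightarrow\quad
\int_{|\theta| < \varepsilon} \frac{dm(\theta)}{Q(e^{i\theta})}
\;\asymp\; \int_0^{\varepsilon} \frac{r^{d-1}}{r^{2}}\, dr
\;=\; \int_0^{\varepsilon} r^{d-3}\, dr .
```

The last integral diverges for $d = 1$ (like $1/r$) and for $d = 2$ (logarithmically), and it is finite exactly when $d \ge 3$, in agreement with the statements above.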

Observe that a problem of dimension $d \ge 3$ for which $\Lambda$ contains fewer than three
linearly independent vector-valued indices trivially reduces to a problem in one or
two dimensions. Hence in general we identify Assumption 4.1 with the case $d \le 2$.
Corollary 2.3 now follows directly from the following lemma.

Lemma 4.3. Let $P \in \mathfrak{P}_+$ and suppose that Assumption 4.1 holds. Then the
optimal solution $\hat{Q}$ to the problem to minimize (2.2) over all $Q \in \bar{\mathfrak{P}}_+$ belongs to $\mathfrak{P}_+$.

Proof. Let $Q \in \partial\mathfrak{P}_+$ be arbitrary. Then, for any $\rho > 0$, $Q(e^{i\theta}) + \rho > 0$ for all
$\theta \in \mathbb{T}^d$. Hence the functional $\mathbb{J}_P$ is also differentiable in $Q + \rho$, and the directional
derivative in the direction $1$ is

$$\delta\mathbb{J}_P(Q + \rho; 1) = \langle c, 1 \rangle - \int_{\mathbb{T}^d} \frac{P}{Q + \rho}\, dm.$$

Now note that $P/(Q + \rho)$ is nonnegative in all points, that it is pointwise monotone
increasing for decreasing values of $\rho$, and that it converges pointwise in the extended
real-valued sense^4 to $P/Q$. Hence by Lebesgue's monotone convergence theorem [67, p.
21] we have, as $\rho \to 0$,

$$\int_{\mathbb{T}^d} \frac{P}{Q + \rho}\, dm \to \int_{\mathbb{T}^d} \frac{P}{Q}\, dm,$$

which, since $P \in \mathfrak{P}_+$, is infinite by Assumption 4.1. Therefore $1$ is a descent direction
from the point $Q$, and hence the optimal solution is not obtained there. Since $Q \in \partial\mathfrak{P}_+$
is arbitrary, this means that the optimal solution is not attained on the boundary,
i.e., we have $\hat{Q} \in \mathfrak{P}_+$. □

It turns out that the multidimensional rational covariance extension problem for
$d \leq 2$ is in fact well-posed in the sense of Hadamard, i.e., the solution depends
smoothly on $c$ and $P$, which is an important property when it comes to tuning of
solutions to design specifications. This follows from the following generalizations to
the multidimensional case of Theorems 1.3 and 1.4 in [14], proved in the appendix.

Theorem 4.4. Let $f^P : \mathfrak{P}_+ \to \mathfrak{C}_+$ be the map from $Q$ to $c$, given
component-wise by

$$ c_k = \int_{\mathbb{T}^d} e^{i(k,\theta)} \frac{P}{Q}\, dm $$

for a fixed $P \in \mathfrak{P}_+$. If $d \leq 2$, $f^P$ is a diffeomorphism.

Theorem 4.5. Suppose that $d \leq 2$. Let $f^P$ be as in Theorem 4.4, and let
$c \in \mathfrak{C}_+$ be fixed. Then the function $g^c : \mathfrak{P}_+ \to \mathfrak{P}_+$
mapping $P$ to $Q = (f^P)^{-1}(c)$ is a diffeomorphism onto its image $\mathfrak{Q}_+$.

$^{\dagger}$That is, the limit may be $\infty$.

By Corollary 2.3, the unique solution $\hat{Q}$ of the dual problem belongs to the interior
$\mathfrak{P}_+$ for every pair $(c, P) \in \mathfrak{C}_+ \times \mathfrak{P}_+$ if
Assumption 4.1 holds. Note that, while the more general Theorem 2.1 holds for all
$P \in \bar{\mathfrak{P}}_+ \setminus \{0\}$, Corollary 2.3 is only valid for
$P \in \mathfrak{P}_+$. The reason for this is that if $P \in \mathfrak{P}_+$ the
directional derivative of $\mathbb{J}_P$ tends to $-\infty$ on the boundary by
Assumption 4.1, so a minimum is not attained there, as we just saw in the proof of
Lemma 4.3. On the other hand, if $P \in \partial\mathfrak{P}_+$ we have
$\int_{\mathbb{T}^d} (P/Q)\,dm < \infty$ for some $Q \in \partial\mathfrak{P}_+$; take for
example $Q = P$. More generally, the integral may not diverge if the zeros of $Q$ form a
subset of the zeros of $P$. In this case, there is no guarantee that the optimal
solution is an interior point. The following simple one-dimensional example illustrates
this.

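Example 4.6. Consider a one-dimensional problem of degree one, i.e., with
$\Lambda = \{-1, 0, 1\}$. Fix $c = (1, c_1)$, where $c_1 \in (-1, 0)$ is arbitrary. Clearly
the Toeplitz matrix

$$ T(c) = \begin{bmatrix} 1 & c_1 \\ c_1 & 1 \end{bmatrix} $$

is positive definite, and hence $c \in \mathfrak{C}_+$. We fix
$P(e^{i\theta}) = 2 + e^{i\theta} + e^{-i\theta}$, which belongs to
$\partial\mathfrak{P}_+$ since $P(e^{i\pi}) = 0$. We want to find a $Q \in \mathfrak{P}_+$ of
degree at most one so that $\Phi = P/Q$ matches the covariance sequence $c$, i.e.,

$$ c_k = \int_{\mathbb{T}} e^{ik\theta}\, \Phi(e^{i\theta})\, dm, \quad k = 0, 1. \tag{4.1} $$

Any such $Q$ must have the form
$Q(e^{i\theta}) = \lambda (1 - \rho e^{i\theta})(1 - \bar\rho e^{-i\theta})$ for some
$\lambda > 0$ and $|\rho| < 1$. Now, clearly

$$ \Phi(e^{i\theta}) = \lambda^{-1}\, \frac{2 + e^{i\theta} + e^{-i\theta}}{1 - |\rho|^2} \cdot \frac{1 - |\rho|^2}{(1 - \rho e^{i\theta})(1 - \bar\rho e^{-i\theta})}, $$

where the second factor takes the form

$$ \frac{1}{1 - \rho e^{i\theta}} + \frac{1}{1 - \bar\rho e^{-i\theta}} - 1 = \cdots + \bar\rho^2 e^{-2i\theta} + \bar\rho e^{-i\theta} + 1 + \rho e^{i\theta} + \rho^2 e^{2i\theta} + \cdots, $$

which implies that $c_0 = \lambda^{-1}(2 + \rho + \bar\rho)(1 - |\rho|^2)^{-1}$ and
$c_1 = \lambda^{-1}(1 + \bar\rho)^2 (1 - |\rho|^2)^{-1}$. Since $c_0 = 1$, we have
$c_1 = (1 + \bar\rho)^2 (2 + \rho + \bar\rho)^{-1}$, which has positive, real denominator.
Then, since $c_1 < 0$, $1 + \bar\rho$ is purely imaginary, which is impossible since
$1 + \bar\rho$ has a positive real part. Hence, there is no $Q \in \mathfrak{P}_+$ of degree
at most one satisfying (4.1). However, for a certain $\hat{Q} \in \partial\mathfrak{P}_+$,
namely $\hat{Q}(e^{i\theta}) = (2 + e^{i\theta} + e^{-i\theta})/(1 + c_1)$, we obtain
$d\mu = (P/\hat{Q})\,dm - c_1 \delta(\theta - \pi)\,d\theta$, i.e.,

$$ d\mu = (1 + c_1)\,dm - c_1\, \delta(\theta - \pi)\,d\theta, $$

which matches $c$ with $-1 < c_1 < 0$. Now $\Phi = 1 + c_1$ and the singular measure
$d\hat\mu = -c_1\,\delta(\theta - \pi)\,d\theta$ has all its mass at the zero of $\hat{Q}$,
as required by Theorem 2.1.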
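The moment formulas in this example can be checked numerically. The sketch below is only an illustration under our own conventions: it assumes $c_k = \int e^{ik\theta}\Phi\,dm$ with $dm = d\theta/2\pi$ and $Q = \lambda|1 - \rho e^{i\theta}|^2$, and all function names are hypothetical. It compares quadrature values of $c_0$ and $c_1$ with the closed-form expressions, and verifies that the measure with a spectral line at $\theta = \pi$ reproduces $(1, c_1)$.

```python
import numpy as np

def moments_rational(lam, rho, n=4096):
    """c_0 and c_1 of Phi = P/Q by quadrature, with P = 2 + 2cos(theta) and
    Q = lam * |1 - rho * exp(i*theta)|^2, |rho| < 1."""
    theta = 2.0 * np.pi * np.arange(n) / n
    z = np.exp(1j * theta)
    phi = (2.0 + 2.0 * np.cos(theta)) / (lam * np.abs(1.0 - rho * z) ** 2)
    return np.mean(phi), np.mean(z * phi)

def moments_closed_form(lam, rho):
    """Closed-form expressions for the same two moments."""
    scale = 1.0 / (lam * (1.0 - abs(rho) ** 2))
    return scale * (2.0 + 2.0 * rho.real), scale * (1.0 + np.conj(rho)) ** 2

def moments_with_spectral_line(c1):
    """c_0 and c_1 of d(mu) = (1 + c1) dm - c1 * delta(theta - pi) d(theta)."""
    c0 = (1.0 + c1) - c1                      # constant density plus point mass
    c1_out = (-c1 * np.exp(1j * np.pi)).real  # only the point mass contributes
    return c0, c1_out

print(moments_rational(1.5, 0.3 + 0.4j))
print(moments_closed_form(1.5, 0.3 + 0.4j))
print(moments_with_spectral_line(-0.4))
```

For real $\rho$ the normalized first moment reduces to $(1+\rho)/2 > 0$, which is consistent with the conclusion that a negative $c_1$ cannot be matched by an interior $Q$.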

In this context it is interesting to note that the covariance extension problem is
usually formulated as a partial realization problem where one wants to determine an
extension of the partial covariance sequence $c$ so that

$$ \Phi_+(z) = \tfrac{1}{2} c_0 + \sum_{k=1}^{\infty} c_k z^{-k} $$

is positive real, i.e., $\Phi_+$ maps the unit disc to the right half of the complex plane;
see, e.g., [50]. Then $\Phi_+(e^{i\theta}) + \Phi_+(e^{-i\theta})$ is the corresponding
spectral density $\Phi(e^{i\theta})$. In our example such a solution is provided by

$$ \Phi_+(z) = \tfrac{1}{2}\left(1 + c_1 - c_1 \frac{z - 1}{z + 1}\right) = \tfrac{1}{2} + c_1 z^{-1} + \cdots, $$

yielding precisely $\Phi = 1 + c_1$. The singular measure never appears in this framework.

5. Logarithmic moments and cepstral matching. Given $c \in \mathfrak{C}_+$, Corollary
2.3 and Theorem 4.5 together provide a complete smooth parameterization in terms
of $P \in \mathfrak{P}_+$ of all $\Phi = P/Q$ such that $d\mu = \Phi\,dm$ satisfies the moment
equations (1.1). Therefore the solution can be tuned to satisfy additional design
specifications by adjusting $P$. How to determine the best $P$ is, however, a separate
problem. Theorem 2.4, to be proved next, extends results from the one-dimensional case to
simultaneously estimate $P$ using the cepstral coefficients and logarithmic moment matching.

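Proof of Theorem 2.4. The proof follows along the same lines as that of Theorem
2.1. By relaxing the primal problem (P) we get the Lagrangian

$$ \mathcal{L}(\Phi, P, Q) = \int_{\mathbb{T}^d} \log\Phi\, dm + \sum_{k \in \Lambda} \bar{q}_k \left( c_k - \int_{\mathbb{T}^d} e^{i(k,\theta)} \Phi\, dm \right) + \sum_{k \in \Lambda \setminus \{0\}} \bar{p}_k \left( \int_{\mathbb{T}^d} e^{i(k,\theta)} \log\Phi\, dm - \gamma_k \right), \tag{5.1} $$

where $q_k$ and $p_k$ are Lagrangian multipliers. Setting $p_0 = \gamma_0 = 1$ and rearranging
terms, this can be written as

$$ \mathcal{L}(\Phi, P, Q) = \langle c, q \rangle - \int_{\mathbb{T}^d} Q\Phi\, dm - \langle \gamma, p \rangle + 1 + \int_{\mathbb{T}^d} P \log\Phi\, dm, \tag{5.2} $$

where the first term in (5.1) has been incorporated in the last term of (5.2). As
before, $\sup_{\Phi \geq 0} \mathcal{L}(\Phi, P, Q)$ is only finite if we restrict $Q$ to
$\bar{\mathfrak{P}}_+$, and similarly we need to restrict $P$ to
$\bar{\mathfrak{P}}_{+,o}$. Taking the directional derivative of (5.2) in any direction
$\delta\Phi$ such that $\Phi + \varepsilon\delta\Phi$ is a nonnegative $L_1(\mathbb{T}^d)$
function for all $\varepsilon \in (0, a)$ for a sufficiently small $a > 0$, we obtain

$$ \delta\mathcal{L}(\Phi, P, Q; \delta\Phi) = \int_{\mathbb{T}^d} \left( \frac{P}{\Phi} - Q \right) \delta\Phi\, dm. $$

For the directional derivative to be nonpositive for all feasible directions $\delta\Phi$ we
need $\Phi = P/Q$ a.e. (cf. Section 3.1.1), which inserted into (5.2) yields

$$ \sup_{\Phi} \mathcal{L}(\Phi, P, Q) = \mathbb{J}(P, Q) + 1 - \int_{\mathbb{T}^d} P\, dm, \tag{5.3} $$

with $\mathbb{J}(P, Q)$ given by (2.4). A closer look at the last term in (5.3) shows that

$$ \int_{\mathbb{T}^d} P\, dm = \int_{\mathbb{T}^d} \sum_{k \in \Lambda} p_k e^{-i(k,\theta)}\, dm = \sum_{k \in \Lambda} p_k \prod_{j=1}^{d} \int_{-\pi}^{\pi} e^{-i k_j \theta_j} \frac{d\theta_j}{2\pi} = p_0 = 1, $$

since all integrals vanish except those for $k_1 = \ldots = k_d = 0$. Consequently,
$\mathbb{J}$ is precisely the dual functional (5.3).

Using the Wirtinger derivative from (3.6) to form the gradient of $\mathbb{J}$, we obtain

$$ \frac{\partial \mathbb{J}(P, Q)}{\partial \bar{q}_k} = c_k - \int_{\mathbb{T}^d} e^{i(k,\theta)} \frac{P}{Q}\, dm, \quad k \in \Lambda, \tag{5.4a} $$

$$ \frac{\partial \mathbb{J}(P, Q)}{\partial \bar{p}_k} = \int_{\mathbb{T}^d} e^{i(k,\theta)} \log\frac{P}{Q}\, dm - \gamma_k, \quad k \in \Lambda \setminus \{0\}. \tag{5.4b} $$

In deriving (5.4b) we used the fact that

$$ \int_{\mathbb{T}^d} P\, \frac{e^{i(k,\theta)}}{P}\, dm = \prod_{j=1}^{d} \int_{-\pi}^{\pi} e^{i k_j \theta_j} \frac{d\theta_j}{2\pi} = 0, \quad k \neq 0. \tag{5.5} $$

Therefore, if $\hat{P} \in \mathfrak{P}_{+,o}$ and $\hat{Q} \in \mathfrak{P}_+$, and hence the
optimal solution is a stationary point of $\mathbb{J}$, then the spectrum
$\hat\Phi = \hat{P}/\hat{Q}$ fulfills both covariance matching (1.1) and
cepstral matching (2.3).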

The following three lemmas ensure the existence of a solution and show that
the problem is in fact convex. The arguments are similar to those in the proof of
Theorem 2.1, and are given in the appendix.

Lemma 5.1. Given $c \in \mathfrak{C}_+$ and a sequence $\gamma = \{\gamma_k\}_{k \in \Lambda}$
with $\gamma_0 = 1$ and $\gamma_{-k} = \bar\gamma_k$, the functional
$(P, Q) \mapsto \mathbb{J}(P, Q)$ is lower semicontinuous on
$\bar{\mathfrak{P}}_{+,o} \times (\bar{\mathfrak{P}}_+ \setminus \{0\})$.

Lemma 5.2. The sublevel sets $\mathbb{J}^{-1}(-\infty, r]$ are compact.

Lemma 5.3. The dual problem (D) in Theorem 2.4 is convex on the domain
$\bar{\mathfrak{P}}_{+,o} \times (\bar{\mathfrak{P}}_+ \setminus \{0\})$.

Next we show that if $\hat{Q} \in \mathfrak{P}_+$ and $\hat{P} \in \mathfrak{P}_{+,o}$ then
$\hat\Phi = \hat{P}/\hat{Q}$ is also optimal for the primal problem of Theorem 2.4. This
follows by observing that $\hat\Phi$ is a primal feasible point and that the primal
functional (2.5) takes the same value as the Lagrangian (5.1) in this point, since we
have covariance and cepstral matching (cf. the proof of Theorem 2.1). Finally, if
$d \leq 2$ then $\hat{Q} \in \mathfrak{P}_+$ whenever $\hat{P} \in \mathfrak{P}_{+,o}$, which
follows directly from Lemma 4.3. This concludes the proof of Theorem 2.4. □

From this proof we see that the stationarity of $\mathbb{J}(P, Q)$ in $Q$ ensures covariance
matching and the stationarity in $P$ provides cepstral matching. Therefore we can only
guarantee matching for a solution in the interior
$\mathfrak{P}_{+,o} \times \mathfrak{P}_+$. This subtle fact was overlooked in [7, 24],
where it is claimed that we also have covariance matching for
$\hat{P} \in \partial\mathfrak{P}_{+,o}$. However, even when $d \leq 2$, we cannot guarantee
that there is a solution $\hat{Q}$ belonging to the interior $\mathfrak{P}_+$ if
$\hat{P} \in \partial\mathfrak{P}_{+,o}$. The following example illustrates this.

Example 5.4. Consider the one-dimensional problem with $c_0 = 2$, $c_{-1} = c_1 = 1$
and $\gamma_1 = -1$. Set

$$ \hat{P}(e^{i\theta}) = 1 - (e^{i\theta} + e^{-i\theta})/2 = 1 - \cos\theta, $$

and $\hat{Q} = \hat{P}$. Clearly $\hat{P}$ and $\hat{Q}$ belong to the boundary, since
$\hat{P}(e^{i0}) = \hat{Q}(e^{i0}) = 0$. Moreover $\hat\Phi = \hat{P}/\hat{Q} = 1$, so there
is neither covariance matching nor cepstral matching. A simple calculation shows that
$\partial\mathbb{J}/\partial q_0 = \partial\mathbb{J}/\partial q_1 = \partial\mathbb{J}/\partial p_1 = 1$.
However, for any feasible direction $(\delta q_0, \delta q_1, \delta p_1)$ in
$(\hat{P}, \hat{Q})$ we have $\mathrm{Re}\{\delta p_1\} \geq 0$ and
$\mathrm{Re}\{\delta q_0 + 2\delta q_1\} \geq 0$, and hence there is no feasible descent
direction from this point. Therefore we have a local minimum, which, by convexity, is
also a global minimum. Consequently, we have an optimal solution on the boundary where
we have neither covariance nor cepstral matching.
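The gradient values claimed in this example can be reproduced with a few lines of quadrature. This is a minimal sketch under our own assumptions: it evaluates (5.4a) and (5.4b) with $dm = d\theta/2\pi$ on a midpoint grid (chosen so that the common zero of $\hat{P}$ and $\hat{Q}$ at $\theta = 0$ is not a grid point), and the variable names are hypothetical.

```python
import numpy as np

n = 2000
theta = 2.0 * np.pi * (np.arange(n) + 0.5) / n   # midpoint grid, avoids theta = 0
P = 1.0 - np.cos(theta)
Q = P.copy()
Phi = P / Q                                      # identically one on the grid
c0, c1, gamma1 = 2.0, 1.0, -1.0

# Gradient components from (5.4a) and (5.4b), with dm = d(theta) / (2*pi).
grad_q0 = c0 - np.mean(Phi)
grad_q1 = c1 - np.mean(np.exp(1j * theta) * Phi)
grad_p1 = np.mean(np.exp(1j * theta) * np.log(Phi)) - gamma1

print(grad_q0, grad_q1, grad_p1)
```

All three components come out equal to one up to rounding, so the gradient is nonzero even though the point is optimal; this is possible only because the point lies on the boundary.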

Remark 5.5. From Theorem 2.1 we know that it is possible to achieve covariance
matching in this example by adding a nonnegative singular measure $d\hat\mu$, representing
spectral lines. In fact, a similar statement can be proved for cepstral matching,
namely that there exists a nonpositive measure $d\hat\nu$ such that
$\mathrm{supp}(d\hat\nu) \subseteq \{\theta \in \mathbb{T}^d \mid \hat{P}(e^{i\theta}) = 0\}$ and

$$ \gamma_k = \int_{\mathbb{T}^d} e^{i(k,\theta)} \left( \log(\hat{P}/\hat{Q})\, dm(\theta) + d\hat\nu(\theta) \right) $$

for all $k \in \Lambda \setminus \{0\}$. However, while the physical interpretation of
$d\hat\mu$ in Theorem 2.1 is clear, in this case it is not obvious what $d\hat\nu$
represents in terms of the spectrum.

Note that the optimization problem is convex but in general not strictly convex, 
and hence the solution might not be unique. This is illustrated in the following 
example [50, p. 504]. 

Example 5.6. Again consider a one-dimensional problem, this time with $c_0 = 1$,
$c_{-1} = c_1 = 0$ and $\gamma_1 = 0$. Choosing

$$ P(e^{i\theta}) = Q(e^{i\theta}) = 1 - \rho\cos\theta, \quad |\rho| < 1, $$

we obtain $\Phi = 1$, which matches the given covariances and cepstral coefficients.
Therefore all $P$ and $Q$ of this form are stationary points of $\mathbb{J}$ and are thus
optimal for the dual problem in Theorem 2.4.

In one dimension there is strict convexity, and thus a unique solution, if and only 
if there is an optimal solution for which P and Q are co-prime [7]. 

5.1. Regularizing the problem. A motivation for simultaneous covariance
and cepstral matching is to obtain a rational spectrum $\Phi = P/Q$ that matches the
covariances without having to provide a prior $P$. However, even if $d \leq 2$, the dual
problem in Theorem 2.4 cannot be guaranteed to produce such a spectrum that satisfies
the covariance constraints (1.1). To remedy this we consider the regularization
proposed by Enqvist [24], which has the objective function

$$ \mathbb{J}_\lambda(P, Q) = \mathbb{J}(P, Q) - \lambda \int_{\mathbb{T}^d} \log P\, dm, $$

where $\lambda \in (0, \infty)$ is the regularization parameter.

The partial derivative with respect to $q_k$ is identical to (5.4a), whereas the partial
derivative with respect to $p_k$ becomes

$$ \frac{\partial \mathbb{J}_\lambda(P, Q)}{\partial \bar{p}_k} = \int_{\mathbb{T}^d} e^{i(k,\theta)} \left( \log\frac{P}{Q} - \frac{\lambda}{P} \right) dm - \gamma_k. $$

By Assumption 4.1, this gradient will be infinite for $P \in \partial\mathfrak{P}_+$, and
hence the optimal solution is not on the boundary. Moreover, with this regularization,
the optimization problem becomes strictly convex and we thus have a unique solution.

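Theorem 5.7. Suppose that $d \leq 2$, and let $\gamma_k$, $k \in \Lambda \setminus \{0\}$, be
any sequence of complex numbers such that $\gamma_{-k} = \bar\gamma_k$. Set
$\gamma = \{\gamma_k\}_{k \in \Lambda}$ where $\gamma_0 = 1$, and let $c \in \mathfrak{C}_+$.
Then for any $\lambda > 0$ there exists a unique solution $(\hat{P}, \hat{Q})$ to the strictly
convex optimization problem to minimize

$$ \mathbb{J}_\lambda(P, Q) = \langle c, q \rangle - \langle \gamma, p \rangle + \int_{\mathbb{T}^d} P \log\frac{P}{Q}\, dm - \lambda \int_{\mathbb{T}^d} \log P\, dm $$

subject to $P \in \bar{\mathfrak{P}}_{+,o}$ and $Q \in \bar{\mathfrak{P}}_+$. Moreover,
$\hat\Phi = \hat{P}/\hat{Q}$ fulfills the covariance matching (1.1) and approximately
fulfills the cepstral matching (2.3) via

$$ \gamma_k + \varepsilon_k = \int_{\mathbb{T}^d} e^{i(k,\theta)} \log\hat\Phi\, dm, \quad \text{where } \varepsilon_k = \lambda \int_{\mathbb{T}^d} e^{i(k,\theta)} \frac{1}{\hat{P}}\, dm. $$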

Proof. In view of what has been said, all of the results follow from Theorem
2.4 except the strict convexity. To prove this, we note that the second directional
derivative of $\mathbb{J}_\lambda$ is given by

$$ \delta^2 \mathbb{J}_\lambda(P, Q; \delta P, \delta Q) = \int_{\mathbb{T}^d} P \left( \delta P \frac{1}{P} - \delta Q \frac{1}{Q} \right)^2 dm + \int_{\mathbb{T}^d} \delta P^2 \frac{\lambda}{P^2}\, dm $$

(cf. the proof of Lemma 5.3 in the appendix). Since both integrands are nonnegative,
they both need to be zero almost everywhere in order for the derivative to vanish.
However, since $P > 0$, this implies that $\delta P \equiv 0$ by continuity. Then the first
integrand becomes $\delta Q^2 P/Q^2$, and in the same way we must thus have
$\delta Q \equiv 0$. Hence $\delta^2 \mathbb{J}_\lambda(P, Q; \delta P, \delta Q) > 0$ for every
nonzero direction $(\delta P, \delta Q)$, implying uniqueness. □
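The expression for the second directional derivative can be verified term by term. The following worked computation is a sketch under the assumption that $P$ and $Q$ are strictly positive, so that differentiation under the integral sign is unproblematic; the linear terms $\langle c, q\rangle$ and $\langle\gamma, p\rangle$ do not contribute.

```latex
\begin{align*}
\frac{d^2}{d\varepsilon^2}
\Big[(P+\varepsilon\,\delta P)\log\frac{P+\varepsilon\,\delta P}{Q+\varepsilon\,\delta Q}\Big]_{\varepsilon=0}
  &= \frac{\delta P^2}{P} - \frac{2\,\delta P\,\delta Q}{Q} + \frac{P\,\delta Q^2}{Q^2}
   = P\Big(\frac{\delta P}{P} - \frac{\delta Q}{Q}\Big)^2, \\
\frac{d^2}{d\varepsilon^2}
\Big[-\lambda\log(P+\varepsilon\,\delta P)\Big]_{\varepsilon=0}
  &= \lambda\,\frac{\delta P^2}{P^2}.
\end{align*}
```

Integrating the two right-hand sides with respect to $dm$ gives the two integrals in the proof.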

6. The circulant problem. Theorem 2.5 in Section 2.2 can be viewed as a
periodic version of Theorem 2.1 and Corollary 2.3, as can be seen by following the
lines of [49], where the one-dimensional problem was first introduced. To this end, we
introduce the discrete measure $d\nu_N$, i.e.,

$$ d\nu_N(\theta) = \sum_{\ell \in \mathbb{Z}^d_N} \delta(\theta_1 - \zeta_{\ell_1}, \ldots, \theta_d - \zeta_{\ell_d}) \prod_{j=1}^{d} \frac{d\theta_j}{N_j}, \tag{6.1} $$

where $\zeta_{\ell_j} := 2\pi\ell_j/N_j$ and $\delta$ is the multidimensional Dirac-delta
function. Then the moment matching condition (2.7) takes the form

$$ c_k = \int_{\mathbb{T}^d} e^{i(k,\theta)}\, \Phi(e^{i\theta})\, d\nu_N, $$

which is similar to (1.1), but where $d\nu_N$ and $dm$ have different mass distributions
(discrete versus continuous). In fact, the main difference in the statements of Theorem
2.5 and Theorem 2.1 together with Corollary 2.3 is that different measures and cones
are used. In the same way, versions of Theorems 2.4 and 5.7 also hold in the circulant
case; see [65] for details.
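In the one-dimensional case the discrete moments are simply an inverse discrete Fourier transform of the spectrum sampled on the grid. The sketch below assumes the convention $c_k = \frac{1}{N}\sum_\ell e^{ik\zeta_\ell}\Phi(e^{i\zeta_\ell})$ with $\zeta_\ell = 2\pi\ell/N$; the function name is our own, and for $d > 1$ one would use `np.fft.ifftn` in the same way.

```python
import numpy as np

def discrete_moments(phi_grid):
    """Moments c_k = (1/N) * sum_l exp(i*k*zeta_l) * Phi(zeta_l), zeta_l = 2*pi*l/N.

    np.fft.ifft computes (1/N) * sum_l a_l * exp(+2*pi*i*k*l/N), which is this sum.
    Negative indices k are stored at position N + k."""
    return np.fft.ifft(phi_grid)

N = 16
zeta = 2.0 * np.pi * np.arange(N) / N
phi = 2.0 + 2.0 * np.cos(zeta)      # Phi = 2 + e^{i theta} + e^{-i theta}
c = discrete_moments(phi)
print(c[0].real, c[1].real, c[-1].real)
```

For a trigonometric polynomial spectrum of degree well below $N$, as here, the discrete moments coincide with the continuous ones; for a general rational spectrum they differ by aliasing terms.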

In connection to this it is also interesting to observe that the discrete counterpart
of Assumption 4.1,

$$ \int_{\mathbb{T}^d} \frac{1}{Q}\, d\nu_N = \infty \quad \text{for all } Q \in \partial\mathfrak{P}_+(N), \tag{6.2} $$

holds for any measure with discrete mass distribution (see also [44]). However,
if $P \in \partial\mathfrak{P}_+(N)$ we may still obtain solutions without covariance
matching, because for any $Q$ that is zero only in a subset of points where $P$ is zero we
will have $\int_{\mathbb{T}^d} (P/Q)\, d\nu_N < \infty$ and hence the optimal solution may
occur on the boundary.

Remark 6.1. Although the measure (6.1) has mass in points placed in the roots
of unity on the d-dimensional torus, one could also consider other mass distributions.
One could place the mass points in the odd points of the roots of unity, a situation
which has been studied in the one-dimensional case and which corresponds to spectra of
skew-periodic processes [66]. The same holds in the multidimensional setting. Also note
that not all dimensions need to have mass distributions of the same type. For example,
the approach in this paper works even if the process is periodic in some of the
dimensions, while non-periodic in others.

6.1. Convergence of discrete to continuous. In [49] Lindquist and Picci
proved for the one-dimensional case that when the number of mass points in the
discrete measure $d\nu_N$ in (6.1) goes to infinity, the solution converges to the solution
of the problem with the continuous measure $dm$. The same is true in higher dimensions,
and the formal result is given in Theorem 2.6 in Section 2.2. In this subsection we
will prove this statement. Note that we use the notation

$$ \mathbb{J}_P(Q) = \langle c, q \rangle - \int_{\mathbb{T}^d} P \log Q\, dm, \tag{6.3a} $$

$$ \mathbb{J}_P^N(Q) = \langle c, q \rangle - \int_{\mathbb{T}^d} P \log Q\, d\nu_N \tag{6.3b} $$

to explicitly distinguish the objective functions using the continuous and the discrete
measure. Moreover let $\hat{Q}$ be the minimizer of (6.3a), subject to
$Q \in \bar{\mathfrak{P}}_+$, and $\hat{Q}_N$ be a minimizer of (6.3b), subject to
$Q \in \bar{\mathfrak{P}}_+(N)$. Before proving the theorem, we make some clarifying
observations.

Remark 6.2. We have already noted that the singular measure $d\hat\mu$ is not unique.
However, the corresponding "rest covariance" $\hat{c}$, which $d\hat\mu$ matches, is unique
(cf. equation (3.7)). In connection to this it is interesting to note that although this
is the case, and although $\hat{Q}_N \to \hat{Q}$, in general $\hat{c}_N \not\to \hat{c}$. To
see this, note that for a $P$ which is positive in all points except for some irrational
frequency$^{\ddagger}$ where $P = 0$, we will have $P \in \mathfrak{P}_+(N)$ for all $N$, since
this point will never belong to the grid. Thus we will have
$\hat{Q}_N \in \mathfrak{P}_+(N)$ and therefore $\hat{c}_N = 0$. However
$P \in \partial\mathfrak{P}_+$ and therefore we can have
$\hat{Q} \in \partial\mathfrak{P}_+$ and hence $\hat{c} \neq 0$. One can construct such an
example based on Example 4.6 by shifting the spectral line to an irrational frequency
point.

Remark 6.3. In connection to the previous remark, we note that in two dimensions
we have $\hat{Q} \in \mathfrak{P}_+$ whenever $P \in \mathfrak{P}_+$ since Assumption 4.1 is
valid for $d = 2$. Hence there will be no singular measure. Moreover, since
$\hat{Q}_N \to \hat{Q}$ as $\min(N)$ goes to infinity, for large enough value of $\min(N)$
we must have $\hat{Q}_N > 0$, i.e., $\hat{Q}_N \in \mathfrak{P}_+$. Therefore
$(P/\hat{Q}_N)\,d\nu_N$ tends to $(P/\hat{Q})\,dm$ in weak*.
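How integrals against $d\nu_N$ approach integrals against $dm$ can be seen on a simple one-dimensional test case. The sketch below is our own illustration, not taken from the paper: for $Q = |1 - re^{i\theta}|^2$ with $0 < r < 1$ the continuous integral $\int \log Q\,dm$ equals zero, while the discrete one equals $(2/N)\log(1 - r^N)$, which tends to zero as $N$ grows. The function name is hypothetical.

```python
import numpy as np

def discrete_log_integral(r, N):
    """Integral of log Q with respect to the discrete measure with N mass points,
    for Q = |1 - r * exp(i*theta)|^2 and 0 < r < 1."""
    zeta = 2.0 * np.pi * np.arange(N) / N
    return float(np.mean(np.log(np.abs(1.0 - r * np.exp(1j * zeta)) ** 2)))

r = 0.9
# Deviation from the continuous value (zero) for an increasing number of mass points.
errors = [abs(discrete_log_integral(r, N)) for N in (8, 16, 32, 64)]
print(errors)
```

The deviation decays like $r^N$ here because $Q$ is bounded away from zero; when $Q$ has zeros on the circle the discrete and continuous integrals can behave very differently, which is the point of Remark 6.2.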

The first thing we need to show is that $\hat{Q}_N$ is in fact well-defined. That this is
not evident from the statement of the theorem becomes apparent when noting the
following relationship among the cones of trigonometric polynomials:

$$ \mathfrak{P}_+(N) \supset \mathfrak{P}_+(2N) \supset \cdots \supset \mathfrak{P}_+. $$

For the dual cones we therefore have [51, pp. 157-158]

$$ \mathfrak{C}_+(N) \subset \mathfrak{C}_+(2N) \subset \cdots \subset \mathfrak{C}_+, $$

and thus it is not guaranteed that minimizing (6.3b) over
$Q \in \bar{\mathfrak{P}}_+(N)$ has a solution for $c \in \mathfrak{C}_+$. However note that
when $N_j \to \infty$ the corresponding set of grid points will become dense on the unit
circle. Therefore $\bar{\mathfrak{P}}_+ = \bigcap_{N \in \mathbb{Z}^d_+} \bar{\mathfrak{P}}_+(N)$.
Using this we have the following lemma, proved in the appendix, which is a generalization
to the multivariable case of Proposition 6 in [49].

$^{\ddagger}$An irrational frequency is an angle $\lambda\pi$ for which $\lambda$ is an
irrational number.

Lemma 6.4. For any $c \in \mathfrak{C}_+$ there exists an $N_0$ such that
$c \in \mathfrak{C}_+(N)$ for all $\min(N) \geq N_0$.

This shows that for each $c \in \mathfrak{C}_+$, the problem of minimizing (6.3b) over
$Q \in \bar{\mathfrak{P}}_+(N)$ does in fact have a solution for large enough values of
$N$. Interestingly, the lemma is equivalent to
$\lim_{\min(N) \to \infty} \mathfrak{C}_+(N) = \mathfrak{C}_+$.

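Proof of Theorem 2.6. Let $\hat{Q}$ and $\hat{Q}_N$ be as in the statement of the theorem.
Choose a $c \in \mathfrak{C}_+$ and a $P \in \bar{\mathfrak{P}}_+ \setminus \{0\}$ and fix
$N_0$ in accordance with Lemma 6.4. Throughout the rest of this proof we only consider
$\min(N) \geq N_0$, which means that an optimal solution $\hat{Q}_N$ exists. Moreover, in
the proof we need the following result, which is proved in the appendix.

Lemma 6.5. The sequence $(\hat{Q}_N)$ is bounded in $L_\infty(\mathbb{T}^d)$.

Since $(\hat{Q}_N)$ is bounded, there is a convergent subsequence, call it $(\hat{Q}_N)$ for
convenience, converging in the $L_\infty(\mathbb{T}^d)$ norm to some function
$\hat{Q}_\infty$. Since $(\hat{Q}_N)$ is a set of continuous functions, this means that the
convergence is in fact uniform and hence $\hat{Q}_\infty$ is a continuous function. Now
since i) the convergence is uniform, ii) $\hat{Q}_\infty$ is continuous, and iii) the grid
points become dense on $\mathbb{T}^d$ as $\min(N)$ goes to infinity, we obtain
$\hat{Q}_\infty(e^{i\theta}) \geq 0$ for all $\theta$, and hence $\hat{Q}_\infty$ belongs to
$\bar{\mathfrak{P}}_+ \setminus \{0\}$.

It remains to show that $\hat{Q}_\infty = \hat{Q}$. This will be done by proving that
$\|\hat{Q}_\infty - \hat{Q}\|_\infty < \varepsilon$ for all $\varepsilon > 0$. To do this, fix
a $\tilde{Q} \in \mathfrak{P}_+$ and consider $Q + \eta\tilde{Q}$, which belongs to
$\mathfrak{P}_+$ for all $\eta > 0$ and all $Q \in \bar{\mathfrak{P}}_+$. By simply adding and
subtracting $\eta\tilde{Q}$, the triangle inequality gives

$$ \|\hat{Q}_\infty - \hat{Q}\|_\infty \leq \eta\|\tilde{Q}\|_\infty + \|(\hat{Q}_\infty + \eta\tilde{Q}) - \hat{Q}\|_\infty. \tag{6.4} $$

We want to bound the second term. To this end, note that

$$ \mathbb{J}_P(\hat{Q} + \eta\tilde{Q}) - \mathbb{J}_P(\hat{Q}) = \langle c, \eta\tilde{q} \rangle - \int_{\mathbb{T}^d} P \log\left( 1 + \eta\frac{\tilde{Q}}{\hat{Q}} \right) dm, $$

and, since the integral is nonnegative, we obtain

$$ \mathbb{J}_P(\hat{Q} + \eta\tilde{Q}) \leq \mathbb{J}_P(\hat{Q}) + \eta\langle c, \tilde{q} \rangle. \tag{6.5} $$

The same holds for $\mathbb{J}_P^N$, i.e.,
$\mathbb{J}_P^N(\hat{Q}_N + \eta\tilde{Q}) \leq \mathbb{J}_P^N(\hat{Q}_N) + \eta\langle c, \tilde{q} \rangle$.
By optimality we also have
$\mathbb{J}_P^N(\hat{Q}_N) \leq \mathbb{J}_P^N(\hat{Q} + \eta\tilde{Q}) < \infty$ for all
$\eta > 0$, and hence

$$ \mathbb{J}_P^N(\hat{Q}_N + \eta\tilde{Q}) \leq \mathbb{J}_P^N(\hat{Q} + \eta\tilde{Q}) + \eta\langle c, \tilde{q} \rangle. \tag{6.6} $$

Now, since $\hat{Q}_N + \eta\tilde{Q} \to \hat{Q}_\infty + \eta\tilde{Q} \in \mathfrak{P}_+$ we
know that, for large enough values of $\min(N)$, we have
$\hat{Q}_N + \eta\tilde{Q} \in \mathfrak{P}_+$. Therefore, the left hand side of (6.6) is
guaranteed to be well-defined for all values of $\min(N)$ larger than this value. We can
thus take the limit on both sides of (6.6) to obtain

$$ \mathbb{J}_P(\hat{Q}_\infty + \eta\tilde{Q}) \leq \mathbb{J}_P(\hat{Q} + \eta\tilde{Q}) + \eta\langle c, \tilde{q} \rangle, $$

which together with (6.5) yields

$$ \mathbb{J}_P(\hat{Q}_\infty + \eta\tilde{Q}) \leq \mathbb{J}_P(\hat{Q}) + 2\eta\langle c, \tilde{q} \rangle. \tag{6.7} $$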
Now consider the sublevel sets
$\{Q \in \bar{\mathfrak{P}}_+ \mid \mathbb{J}_P(Q) \leq \mathbb{J}_P(\hat{Q}) + \delta\}$,
$\delta > 0$. Since the Hessian at the optimal solution is positive definite, the
intersection of these sets over all $\delta > 0$ is $\{\hat{Q}\}$. Therefore, it follows
from (6.7) that $\eta > 0$ can be chosen so that
$\|(\hat{Q}_\infty + \eta\tilde{Q}) - \hat{Q}\|_\infty < \varepsilon$ for any
$\varepsilon > 0$. Consequently, by selecting $\eta$ sufficiently small, we may bound (6.4)
by an arbitrarily small positive number. Hence $\hat{Q}_\infty = \hat{Q}$. □

7. Application to system identification. The power spectrum of a signal
represents the energy distribution across frequencies of the signal. For a
multidimensional, discrete-time, zero-mean, and homogeneous$^{\S}$ stochastic process
$\{y(t)\}$, defined for $t \in \mathbb{Z}^d$, the power spectrum is defined as the
nonnegative measure $d\mu$ on $\mathbb{T}^d$ whose Fourier coefficients are the covariances

$$ c_k = \int_{\mathbb{T}^d} e^{i(k,\theta)}\, d\mu(\theta). $$

In one dimension the singular part of the measure represents spectral lines, and if the
absolutely continuous part is also rational, $\Phi = P/Q$, one can use spectral
factorization to determine the filter coefficients for an autoregressive-moving-average
(ARMA) model which, when fed with white noise input, reproduces a stochastic signal with
the same power distribution as $\Phi$. Therefore the one-dimensional rational covariance
extension problem can be used for system identification [50].

With the theory developed in this paper we can estimate rational spectra in higher
dimensions. However spectral factorization is not in general possible when $d > 1$
[21]. For $d = 2$, Geronimo and Woerdeman have established conditions for when
it is possible to factorize a given trigonometric polynomial as a sum-of-one-square
[35, Thm. 1.1.1]. These include a non-trivial rank condition on a reduced matrix
of Fourier coefficients, which we shall call $\Gamma_{\mathrm{red}}$, and they also give an
explicit algorithm for obtaining the factors in cases when it is possible. Nevertheless,
in the following example we will illustrate how the theory could be used in the case
when covariances and cepstral coefficients come from a rational, factorizable spectrum.

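We consider a 2D recursive filter with transfer function

$$ \frac{b(e^{i\theta_1}, e^{i\theta_2})}{a(e^{i\theta_1}, e^{i\theta_2})} = \frac{\sum_{k \in \Lambda_+} b_k e^{-i(k,\theta)}}{\sum_{k \in \Lambda_+} a_k e^{-i(k,\theta)}}, $$

where $\Lambda_+ = \{(k_1, k_2) \in \mathbb{Z}^2 \mid 0 \leq k_1 \leq 2,\ 0 \leq k_2 \leq 2\}$ and
the coefficients are given by $b_{(k_1,k_2)} = B_{k_1+1,k_2+1}$ and
$a_{(k_1,k_2)} = A_{k_1+1,k_2+1}$, where

$$ B = \begin{bmatrix} 0.9589 & -0.0479 & 0.0959 \\ 0.0959 & 0.0479 & 0.0959 \\ -0.0959 & 0.0479 & 0.1918 \end{bmatrix}, \quad A = \begin{bmatrix} 1.0000 & 0.1000 & 0.0500 \\ -0.1000 & 0.0500 & -0.0500 \\ 0.2000 & -0.0500 & -0.1000 \end{bmatrix}. $$

Then the corresponding spectrum is given by

$$ \Phi(e^{i\theta}) = \frac{P(e^{i\theta})}{Q(e^{i\theta})} = \left| \frac{b(e^{i\theta_1}, e^{i\theta_2})}{a(e^{i\theta_1}, e^{i\theta_2})} \right|^2, $$

and hence the index set $\Lambda$ of the coefficients of the trigonometric polynomials $P$
and $Q$ is given by
$\Lambda = \{(k_1, k_2) \in \mathbb{Z}^2 \mid |k_1| \leq 2,\ |k_2| \leq 2\}$.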
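The spectrum, covariances and cepstral coefficients of this filter can be evaluated on a discrete grid with a few FFT calls. The following is a minimal sketch in Python (the paper itself uses Matlab and CVX); it assumes the sign convention $\sum_k b_k e^{-i(k,\theta)}$ for the polynomials and $c_k = \frac{1}{N_1 N_2}\sum_\ell e^{i(k,\zeta_\ell)}\Phi(e^{i\zeta_\ell})$ for the discrete moments, and the helper name `evaluate` is our own.

```python
import numpy as np

B = np.array([[ 0.9589, -0.0479, 0.0959],
              [ 0.0959,  0.0479, 0.0959],
              [-0.0959,  0.0479, 0.1918]])
A = np.array([[ 1.0000,  0.1000,  0.0500],
              [-0.1000,  0.0500, -0.0500],
              [ 0.2000, -0.0500, -0.1000]])

def evaluate(coeffs, n_grid):
    """Evaluate sum_k coeffs[k1, k2] * exp(-i*(k1*theta1 + k2*theta2)) on the grid
    theta_j = 2*pi*l_j/n_grid; a zero-padded forward FFT computes exactly this sum."""
    return np.fft.fft2(coeffs, s=(n_grid, n_grid))

n_grid = 30
b = evaluate(B, n_grid)
a = evaluate(A, n_grid)
P = np.abs(b) ** 2                   # numerator polynomial on the grid
Q = np.abs(a) ** 2                   # denominator polynomial on the grid
Phi = P / Q                          # spectrum on the 30 x 30 grid

c = np.fft.ifft2(Phi)                # discrete covariances c_k
gamma = np.fft.ifft2(np.log(Phi))    # discrete cepstral coefficients
```

Only the entries of `c` and `gamma` with indices in $\Lambda$ would be passed on to the matching problem; solving that problem is not attempted in this sketch.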

We approximate the continuous problem with a discrete one in accordance with
Theorem 2.6. The two-dimensional spectrum $\Phi$ is evaluated on a grid of size
$30 \times 30$, and shown in Figure 7.1. The trigonometric polynomials corresponding to the
true spectrum are shown in Figure 7.2. Its covariances and cepstral coefficients are
computed, and a spectrum is then estimated by (unregularized) covariance and cepstral
matching along the lines of Theorem 2.4. The problem is solved numerically using
CVX, a Matlab package for solving disciplined convex programming problems [36,37],
and the resulting spectrum is shown in Figure 7.3a. The relative error$^{\P}$ is shown in
Figure 7.3b. As seen from the relative error, we recover the true spectrum with good
accuracy. For the ME solution, the resulting spectrum and relative error are shown in
Figure 7.4.

$^{\S}$Homogeneity implies that covariances $c_k := E\{y(t + k)\overline{y(t)}\}$ are
invariant with "time" $t \in \mathbb{Z}^d$. From this it is also easy to see that
$c_{-k} = \bar{c}_k$.

Fig. 7.1. The true spectrum.

Fig. 7.2. The spectrum of the system. (a) The true polynomial P. (b) The true polynomial Q.

For system identification we are now interested in factorizing the two rational
spectra as a sum-of-one-square, if possible. To check factorizability for the two
solutions, we apply the rank condition from [35, Theorem 1.1.1], which requires that the
corresponding submatrix $\Gamma_{\mathrm{red}}$ should be of rank four in both cases.
However, such a matrix is generically full rank and we have to study the singular values
in order to determine the numerical rank.

To illustrate this issue, in Figure 7.5 we plot the singular values of
$\Gamma_{\mathrm{red}}$ for the respective polynomials. Figure 7.5b shows the singular
values corresponding to the solution $Q_{\mathrm{true}\,P}$ computed with the true
polynomial $P$ as prior (cf. Theorem 2.1 and Section 3.2). This solution, as well as the
solution obtained by covariance and cepstral matching, gives the exact spectrum back, up
to numerical errors, and hence should be factorizable. For both these solutions we can
also observe a significant decrease in size between the fourth and the fifth singular
values in Figure 7.5b. This indicates that the matrices in fact have numerical rank four,
and spectral factorization is thus possible. Performing the spectral factorization on
the solution with covariance and cepstral matching gives polynomials with coefficients

$$ B_{\mathrm{est}} = \begin{bmatrix} 0.9589 & -0.0479 & 0.0959 \\ 0.0959 & 0.0479 & 0.0959 \\ -0.0959 & 0.0479 & 0.1918 \end{bmatrix}, \quad A_{\mathrm{est}} = \begin{bmatrix} 1.0000 & 0.1000 & 0.0500 \\ -0.1000 & 0.0500 & -0.0500 \\ 0.2000 & -0.0500 & -0.1000 \end{bmatrix}, $$

which agree completely with the true coefficients.

$^{\P}$Let the relative error between two functions $\Phi_{\mathrm{true}}$ and
$\Phi_{\mathrm{est}}$ be the point-wise evaluation of
$|\Phi_{\mathrm{true}} - \Phi_{\mathrm{est}}|/\Phi_{\mathrm{true}}$.

Fig. 7.3. Spectrum estimated with covariance and cepstral matching. (a) Estimated spectrum. (b) Relative error.

Fig. 7.4. The ME-estimation and relative error to true spectrum. (a) ME-spectrum. (b) Relative error.

For the ME spectrum on the other hand there is no guarantee that it will be
factorizable. In general there is a priori no reason why spectral factorization should
be possible. However, in Figure 7.5b we observe a decrease in size between the fourth
and the fifth singular values also for the ME solution
$\Phi_{\mathrm{ME}} = 1/Q_{\mathrm{ME}}$, although this decrease is significantly smaller
than for the other polynomials. If for the moment we assume that the rank condition on
$\Gamma_{\mathrm{red}}$ is actually (approximately) satisfied and apply the factorization
algorithm of [35], we obtain the coefficients

$$ A_{\mathrm{ME}} = \begin{bmatrix} 1.0317 & -0.1881 & 0.2872 \\ 0.1423 & -0.0173 & -0.0570 \\ -0.0251 & -0.1252 & -0.2597 \end{bmatrix} $$

for the possible spectral factor $a_{\mathrm{ME}}$ of $Q_{\mathrm{ME}}$. Forming the
corresponding true $Q$, namely $|a_{\mathrm{ME}}|^2$, and comparing it with
$Q_{\mathrm{ME}}$, we obtain a relative error of up to 10% with respect to
$Q_{\mathrm{ME}}$. We leave the question whether this is a reasonable approximation to a
future study. Note also that if the ME spectrum is factorizable, the factors are given
directly from the covariances by the Geronimo and Woerdeman algorithm. However if this
is not the case, rational covariance extension will still give a rational spectrum. An
important open question related to this, and suggested by the above analysis, is whether
the solution can be tuned by an appropriate choice of $P$ so that the rank condition is
satisfied, and hence factorization is possible.

Fig. 7.5. The singular values of the reduced covariance matrix. (a) Singular values of $\Gamma_{\mathrm{red}}$ for different P. (b) Singular values of $\Gamma_{\mathrm{red}}$ for different Q.

8. Application to image compression. Since the expression (1.2b) is determined
by a limited number of parameters, this approach enables compression of data.
Moreover, the smoothness of the parameterization will facilitate tuning to
specifications. Therefore we apply the two-dimensional circulant RCEP to compression of
black-and-white images. Compression is achieved by approximating the image with a
rational spectrum, thereby using fewer parameters. We compare the ME spectrum to
the solution resulting from regularized covariance and cepstral matching. By choosing
$n_1 \ll N_1$, $n_2 \ll N_2$, where $N_1$ and $N_2$ are the dimensions of the image, we obtain
a significant reduction in number of parameters describing the image.

A seemingly straightforward way is to compute the covariances and cepstral
coefficients directly from the image, and then use these to compute the spectrum.
However, if the discrete spectrum is zero in one of the grid points, the (discrete)
cepstrum is not well-defined. Hence simultaneous covariance and cepstral matching
cannot be applied. Therefore we transform the image, denoted by $\Psi$, using
$\Phi = e^{\Psi}$. Since $\Psi$ is real, $\Phi$ is guaranteed to be real and positive for all
discrete frequencies, and $\Psi$ is obtained as $\Psi = \log\Phi$. We then compute (1.1) and
(2.3) and obtain the approximant $\hat\Phi$ from Theorem 5.7. Here we use the real
sequences of covariances and cepstral coefficients obtained by extending the image by
symmetric mirroring (i.e., using the discrete cosine transform [61, Section 4.2]).
However, the covariances and cepstral coefficients of $\Phi$ can also be computed as the
inverse 2D-FFT of $\Phi$ and $\Psi$, respectively.
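The exponential transformation and the FFT-based computation of the two coefficient sets can be sketched as follows. This is an illustration only: a small random array stands in for the image, the plain inverse FFT is used instead of the symmetric mirroring (discrete cosine transform) route described above, and all variable names are our own.

```python
import numpy as np

rng = np.random.default_rng(0)
Psi = rng.random((8, 8))              # stand-in for a small grey-scale image
Phi = np.exp(Psi)                     # real and strictly positive on the grid

c = np.fft.ifft2(Phi)                 # covariances of Phi
gamma = np.fft.ifft2(np.log(Phi))     # cepstral coefficients of Phi

# With all coefficients kept (no compression), the image is recovered exactly.
Psi_back = np.log(np.fft.fft2(c).real)
```

Compression enters only when `c` and `gamma` are truncated to the index set $\Lambda$ and the spectrum is re-estimated from the truncated data, for instance via Theorem 5.7; that step is not shown here.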

Moreover, note that an ME solution of the same maximum degree as a solution
with a full-degree $P$ has about half the number of parameters. To compensate for
this, we let the degree of the ME solution be a factor $\sqrt{2}$ higher (rounded up), in
order to get a fair comparison.

Fig. 8.1. A simplistic test image. Each black or white square is 128 x 128 pixels.

Fig. 8.2. Compressions of the simple image shown in Figure 8.1. The top row shows
compression with regularized covariance and cepstral matching, where $\lambda$ = 10“^, and the
bottom row shows compression with the maximum-entropy solution. In all cases
$n_1 = n_2$, and the pair of compressions in each column have approximately the same
number of parameters, namely $n_{\mathrm{ME}} \approx \sqrt{2}\, n_{\mathrm{ceps}}$.

8.1. Compression of simplistic images. To better understand the different
methods we first perform compression on a simple image of only black and white
squares. The original image is shown in Figure 8.1 and various results are shown in
Figure 8.2. Figure 8.2a shows that, if too few coefficients are used, the compression
cannot represent the harmonics present in the image, regardless of the use of a
nontrivial $P$. A visual assessment of the result shows that 8.2e clearly outperforms 8.2a,
and that 8.2f is still slightly better than 8.2b. However 8.2c and 8.2d are better than
8.2g and 8.2h, respectively. In order to more objectively assess the quality of the two
different compression methods, we also compute the MSSIM value of the compressed
images. This is a measure, taking values in the interval [0,1], for evaluating quality
and degradation of images, for which 1 means exact agreement [72]. A plot of the
MSSIM value for compressions of different degree is shown in Figure 8.3. However
note that this measure does not agree completely with the visual impression of all
images. Most notably, the measure gives a higher value to the grey image in Figure 8.2a
than the image with structure in Figure 8.2e.

Fig. 8.3. MSSIM values of different compression levels, plotted against n for the
compression with cepstral matching. Hence the corresponding ME compression has
$n_{\mathrm{ME}} \approx \sqrt{2}\, n$ coefficients.

Fig. 8.4. Compression of the Shepp-Logan phantom, with a compression rate of 97%.
(a) Original image. (b) Cepstral matching, n = 30 and $\lambda$ = 10“^. (c) ME solution, n = 45.

Fig. 8.5. Compression of the Lenna image, with a compression rate of about 97%.
(a) Original image. (b) Cepstral matching, n = 60 and $\lambda$ = 10“^. (c) ME solution, n = 85.

8.2. Compression of real images. We now apply the methods to some more
realistic images. In the first example, shown in Figure 8.4a, the original image is the
Shepp-Logan phantom often used in medical imaging [69], of size 256 x 256 pixels.
In Figure 8.4b a compression using covariance and cepstral matching is shown, where
$n_1 + 1 = n_2 + 1 = 30$. Hence this image is described by $2 \cdot 30^2 = 1800$ parameters,
compared to the original $256^2 = 65536$ parameters, which corresponds to a reduction
in parameters of about 97%. We also compute an ME compression, with degree
$n_1 + 1 = n_2 + 1 = 45 \approx \sqrt{2} \cdot 30$, which is shown in Figure 8.4c.

The second example is a compression of the classical Lenna image, often used in
the image processing literature. The original image, shown in Figure 8.5a, is 512 x 512
pixels. For regularized cepstral matching we set $n_1 + 1 = n_2 + 1 = 60$, corresponding
to a compression rate of about 97%, and the result is shown in Figure 8.5b. The ME
compression, computed with $n_1 + 1 = n_2 + 1 = 85 \approx \sqrt{2} \cdot 60$, is shown in
Figure 8.5c.
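The parameter counts behind the quoted compression rates can be reproduced with simple arithmetic. The sketch below makes one bookkeeping assumption of our own: a solution with full-degree $P$ is counted as $2n^2$ parameters (one $n \times n$ window each for $P$ and $Q$), and an ME solution as $n^2$ parameters ($Q$ only). The function name is hypothetical.

```python
def parameter_reduction(n, image_side, n_polynomials=2):
    """Parameter count and relative reduction for an n x n coefficient window.

    n_polynomials = 2 counts both P and Q (cepstral matching); n_polynomials = 1
    counts Q only (ME solution). This bookkeeping is an assumption of the sketch."""
    n_params = n_polynomials * n ** 2
    return n_params, 1.0 - n_params / image_side ** 2

shepp_ceps = parameter_reduction(30, 256)      # 1800 parameters
shepp_me = parameter_reduction(45, 256, 1)     # 2025 parameters
lenna_ceps = parameter_reduction(60, 512)      # 7200 parameters
lenna_me = parameter_reduction(85, 512, 1)     # 7225 parameters

for name, (count, rate) in [("Shepp-Logan, cepstral", shepp_ceps),
                            ("Shepp-Logan, ME", shepp_me),
                            ("Lenna, cepstral", lenna_ceps),
                            ("Lenna, ME", lenna_me)]:
    print(f"{name}: {count} parameters, reduction {rate:.1%}")
```

Under this counting, all four cases correspond to a reduction of roughly 97%, and the two methods use a comparable number of parameters for each image.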

The MSSIM values for these compressions are shown in Table 8.1. They seem to
agree with the visual impression. Interestingly the compression with cepstral matching
is better for the Shepp-Logan phantom. However, in the Lenna image neither of the
methods outperforms the other. The ME compression has more ringing artifacts, but
it is less blurred than the cepstral compression. We believe that this is related to
the fact that if you have relatively few sharp transitions in pixel values, which is the
case in Figure 8.1 and Figure 8.4a, placing both poles and zeros close to each other
can achieve this transition efficiently and thus give better quality on the compressed
image. However when this is not the case, as with the Lenna image, the trade-off
between having spectral zeros or matching higher frequencies is more complex.

Table 8.1
MSSIM values of different compression techniques on the two test images.

                Shepp-Logan     Lenna
Compression     MSSIM value     MSSIM value
Cepstral        0.8690          0.7451
ME              0.7044          0.7489


Similar methods have previously been used for compression of textures [18,59],
where, instead of a scalar two-dimensional moment problem, a one-dimensional vector
problem is considered. Here the image is modeled by a periodic stochastic vector process
rather than a two-dimensional random field, leading to a discrete vector moment
problem akin to the one presented in [49]. This is connected to the circulant moment
problem considered in Section 2.2 and to modeling of reciprocal systems [17,47].

Appendix A. 

In this appendix we provide the proofs that have been deferred in the main text. 
Some of the proofs use general properties of multidimensional trigonometric polynomials,
summarized in this lemma.

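Lemma A.1. For all $P \in \bar{\mathfrak{P}}_+$ we have i) $|p_{\mathbf k}| \le p_{\mathbf 0}$ and ii) $\|P\|_\infty \le |\Lambda|\,\|p\|_\infty$.

Proof. The fact that $|p_{\mathbf k}| \le \int_{\mathbb T^d} |P|\,dm = p_{\mathbf 0}$ implies i).

Next we note that $P$ has $|\Lambda|$ coefficients, and hence

\[
\|P\|_\infty \le \sum_{\mathbf k \in \Lambda} |p_{\mathbf k}| \le |\Lambda|\,\|p\|_\infty,
\]

which proves ii). □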
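
Both bounds of Lemma A.1 can be checked numerically on a small example. The sketch below makes several assumptions of ours: the nonnegative polynomial is built as $P = |A|^2$ for a random two-dimensional polynomial $A$, the coefficient convention is $P(e^{i\theta}) = \sum_{\mathbf k} p_{\mathbf k} e^{-i(\mathbf k,\theta)}$, and the supremum of $P$ is only approximated on a uniform grid. All function names are hypothetical.

```python
import cmath
import itertools
import math
import random


def coefficients(a):
    """Coefficients p_k of P = |A|^2, with A(e^{i theta}) = sum_m a[m] exp(-i <m, theta>)."""
    p = {}
    for m, am in a.items():
        for n, an in a.items():
            k = (m[0] - n[0], m[1] - n[1])
            p[k] = p.get(k, 0j) + am * an.conjugate()
    return p


def evaluate(p, theta):
    """Evaluate P(e^{i theta}) = sum_k p_k exp(-i <k, theta>); real up to rounding."""
    total = sum(pk * cmath.exp(-1j * (k[0] * theta[0] + k[1] * theta[1]))
                for k, pk in p.items())
    return total.real


def check_lemma_a1(a, grid=40, tol=1e-12):
    """Return (i holds, ii holds) for P = |A|^2; the sup of P is taken over a grid."""
    p = coefficients(a)
    p0 = p[(0, 0)].real
    sup_p = max(abs(pk) for pk in p.values())
    angles = [2 * math.pi * j / grid for j in range(grid)]
    sup_P = max(abs(evaluate(p, (s, t))) for s in angles for t in angles)
    part_i = all(abs(pk) <= p0 + tol for pk in p.values())
    part_ii = sup_P <= len(p) * sup_p + tol  # len(p) plays the role of |Lambda|
    return part_i, part_ii


if __name__ == "__main__":
    random.seed(0)
    a = {m: complex(random.gauss(0, 1), random.gauss(0, 1))
         for m in itertools.product(range(2), repeat=2)}
    print(check_lemma_a1(a))
```

Here the index set of $P$ is $\{-1,0,1\}^2$, so $|\Lambda| = 9$; a grid check of this kind can only illustrate the lemma, not prove it.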

Proof of Lemma 3.1. To show lower semicontinuity of

\[
\mathbb{J}_P(Q) = \langle c, q\rangle + \int_{\mathbb T^d} -P\log Q\,dm
\]

we note that $\langle c, q\rangle$ is continuous and hence only the integral needs to be considered.
Fix any $Q \in \bar{\mathfrak{P}}_+ \setminus \{0\}$. From [68, p. 223] we know that it is log-integrable.
Moreover, let $(Q_n)$ be a sequence of trigonometric polynomials in $\bar{\mathfrak{P}}_+ \setminus \{0\}$ that
converges to $Q$ in $L^\infty(\mathbb T^d)$. We know that $Q$ is bounded, and, since the convergence
$Q_n \to Q$ is uniform, we must have $M := \sup_n\{\max_\theta [Q_n]\} < \infty$, and thus $0 \le Q/M \le 1$
and $0 \le Q_n/M \le 1$ for all $n$. Moreover, $\lim_{n\to\infty} -\log(Q_n/M) = -\log(Q/M)$ in
extended real-valued sense. Since $-\log(Q_n/M) \ge 0$, by Fatou's Lemma [67, p. 23],
we have

\[
\int_{\mathbb T^d} -P\log\left(\frac{Q}{M}\right) dm \le \liminf_{n\to\infty} \int_{\mathbb T^d} -P\log\left(\frac{Q_n}{M}\right) dm.
\]

Since $(Q_n)$ is an arbitrary sequence, the functional is lower semicontinuous in $Q$.
Moreover, since $Q$ is also arbitrary it follows that $\mathbb{J}_P$ is lower semicontinuous on
$\bar{\mathfrak{P}}_+ \setminus \{0\}$. □

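Proof of Proposition 4.2. Let $\mathbf k_1, \mathbf k_2, \mathbf k_3 \in \Lambda$ be three linearly independent index
vectors. First note that the trigonometric polynomial
$Q(e^{i\theta}) = \sum_{\ell=1}^{3} \left(1 - (e^{i(\mathbf k_\ell,\theta)} + e^{-i(\mathbf k_\ell,\theta)})/2\right)$
is nonnegative and $Q(e^{i\mathbf 0}) = 0$, hence $Q \in \partial\mathfrak{P}_+$. Next we will show that
$\int_{\mathbb T^d} Q^{-1}\,dm(\theta)$ is finite. By the variable change $\phi = A\theta$, where the matrix $A$ is selected
to be invertible and with $\ell$th row equal to $\mathbf k_\ell$ for $\ell = 1, 2, 3$, the integral becomes

\[
\int_{\mathbb T^d} Q^{-1}\,dm(\theta) = |\det(A)|^{-1} \int_{A(\mathbb T^d)} \frac{1}{\sum_{\ell=1}^{3}(1 - \cos(\phi_\ell))}\,dm(\phi),
\]

where the set $A(\mathbb T^d) = \{A\theta \mid \theta \in \mathbb T^d\}$. Due to the periodicity of the integrand, the
integral is bounded by

\[
\kappa \int_{\mathbb T^3} \frac{d\phi_1\,d\phi_2\,d\phi_3}{\sum_{\ell=1}^{3}(1 - \cos(\phi_\ell))}
\]

for some constant $\kappa$ that depends on $A$ and $d$. This bound is finite [41,44], and
therefore the proposition follows. □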
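
The finiteness of the three-dimensional bound can also be made plausible by an elementary estimate. The following is a sketch of ours under a standard inequality, not the argument of [41,44]; it takes $\mathbb T^3$ as the cube $[-\pi,\pi]^3$ with Lebesgue measure.

```latex
% For |\phi| \le \pi we have 1 - \cos\phi = 2\sin^2(\phi/2) \ge 2\phi^2/\pi^2, and the
% cube [-\pi,\pi]^3 is contained in the ball of radius \sqrt{3}\,\pi. Hence
\[
\int_{[-\pi,\pi]^3} \frac{d\phi_1\,d\phi_2\,d\phi_3}{\sum_{\ell=1}^{3}(1-\cos\phi_\ell)}
\;\le\; \frac{\pi^2}{2}\int_{|\phi|\le\sqrt{3}\,\pi} \frac{d\phi}{|\phi|^2}
\;=\; \frac{\pi^2}{2}\cdot 4\pi \int_0^{\sqrt{3}\,\pi} \frac{r^2}{r^2}\,dr
\;=\; 2\sqrt{3}\,\pi^4 \;<\; \infty .
\]
```

The volume element $r^2\,dr$ cancels the singularity $r^{-2}$ only from dimension three onward; with one or two cosine terms the corresponding integral diverges at the origin, since $1 - \cos\phi \le \phi^2/2$.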

To prove Theorem 4.4, we need the following lemma. 

Lemma A.2. $f^p$ is a bijective map.

Proof. By Corollary 2.3, $f^p$ is injective, since there is a unique minimizer of (2.2)
over all $Q \in \mathfrak{P}_+$. Hence there is at most one $q$ corresponding to a certain $c$, proving
injectivity. Surjectivity also follows from Corollary 2.3. We fix a $P \in \mathfrak{P}_+$ and simply
note that there exists a unique solution for all $c \in \mathfrak{C}_+$, given by $q = (f^p)^{-1}(c)$. □

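Proof of Theorem 4.4. In the proof of Theorem 2.1 we saw that $\delta^2\mathbb{J}_P(Q; \delta Q) > 0$
for all nontrivial variations $\delta Q$. Hence

\[
-\frac{\partial f^p}{\partial q} = \left[\frac{\partial^2 \mathbb{J}_P(Q)}{\partial q_{\mathbf k}\,\partial q_{\boldsymbol\ell}}\right]_{\mathbf k, \boldsymbol\ell \in \Lambda} \tag{A.1}
\]

is positive definite. Next, we define the map $r$ from $\mathfrak{C}_+ \times \mathfrak{P}_+$ to
$\{(r_{\mathbf k})_{\mathbf k\in\Lambda} \in \mathbb{C}^{|\Lambda|} \mid r_{-\mathbf k} = \bar r_{\mathbf k},\ \mathbf k \in \Lambda\}$ as

\[
r_{\mathbf k}(c, q) = c_{\mathbf k} - \int_{\mathbb T^d} e^{i(\mathbf k,\theta)}\,\frac{P}{Q}\,dm.
\]

By Corollary 2.3, $r(c, q) = 0$ has a unique solution for each $c \in \mathfrak{C}_+$. Since $\partial r/\partial q =
-\partial f^p/\partial q$ is invertible, the Implicit Function Theorem implies that $q = (f^p)^{-1}(c)$ is
locally a $C^1$ function and hence $f^p$ is a local diffeomorphism. However, $f^p$ is a bijection
(Lemma A.2) and therefore a (global) diffeomorphism. □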

By Theorem 4.4, the function $g^c$ is a well-defined map. The proof of Theorem 4.5
now follows along the same lines. 

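Lemma A.3. $g^c$ is a bijective map.

Proof. Surjectivity of $g^c$ onto its image follows directly from the definition. A
straightforward generalization of Lemma 2.4 in [14] shows that $g^c$ is injective. □

Proof of Theorem 4.5. Let the map $r$, taking values in
$\{(r_{\mathbf k})_{\mathbf k\in\Lambda} \in \mathbb{C}^{|\Lambda|} \mid r_{-\mathbf k} = \bar r_{\mathbf k},\ \mathbf k \in \Lambda\}$, be given by

\[
r_{\mathbf k}(p, q) = c_{\mathbf k} - \int_{\mathbb T^d} e^{i(\mathbf k,\theta)}\,\frac{P}{Q}\,dm.
\]

The Jacobian with respect to $q$ is the same as (A.1), which is positive definite. Hence
$q = g^c(p)$ is $C^1$ by the Implicit Function Theorem. Moreover,

\[
\frac{\partial r_{\mathbf k}}{\partial p_{\boldsymbol\ell}} = -\int_{\mathbb T^d} \frac{e^{i(\mathbf k - \boldsymbol\ell,\theta)}}{Q}\,dm
\]

defines an invertible Jacobian. Hence $p = (g^c)^{-1}(q)$ is $C^1$, so $g^c$ is a local diffeomorphism.
Since it is a bijection (Lemma A.3), it is a (global) diffeomorphism. □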

Proof of Lemma 5.1. For any $Q \in \bar{\mathfrak{P}}_+ \setminus \{0\}$, $\log Q$ is integrable [68, p. 223]. Since
$P \in \bar{\mathfrak{P}}_{+,o}$, $P$ is not the zero polynomial; hence, since $x\log x \to 0$ as $x \to 0$, $P\log P$ is
integrable and in fact continuous for all $P \in \bar{\mathfrak{P}}_{+,o}$. Hence

\[
\int_{\mathbb T^d} P\log\frac{P}{Q}\,dm = \int_{\mathbb T^d} P\log P\,dm - \int_{\mathbb T^d} P\log Q\,dm,
\]

and therefore we can rewrite the functional $\mathbb{J}(P, Q)$ as

\[
\mathbb{J}(P, Q) = \langle c, q\rangle - \langle\gamma, p\rangle + \int_{\mathbb T^d} P\log P\,dm - \int_{\mathbb T^d} P\log Q\,dm.
\]

All terms in this expression are continuous, except possibly the last integral. However,
following along the same lines as in the proof of Lemma 3.1, we can apply Fatou's
Lemma, showing that $\mathbb{J}(P, Q)$ is lower semicontinuous. □

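Proof of Lemma 5.2. To show that the sublevel sets $\mathbb{J}^{-1}(-\infty, r]$ are compact, we
proceed as in [50, p. 503] by first splitting the objective function into two parts

\[
\mathbb{J}_1(P, Q) = \langle c, q\rangle - \int_{\mathbb T^d} P\log Q\,dm
\quad\text{and}\quad
\mathbb{J}_2(P) = -\langle\gamma, p\rangle + \int_{\mathbb T^d} P\log P\,dm.
\]

The sublevel set consists of the $(P, Q) \in \bar{\mathfrak{P}}_{+,o} \times \bar{\mathfrak{P}}_+$ such that $r \ge \mathbb{J}_1(P, Q) + \mathbb{J}_2(P)$,
and from Lemma 3.3 we have $\mathbb{J}_1(P, Q) \ge \varepsilon\|Q\|_\infty - \log\|Q\|_\infty$, since $\int_{\mathbb T^d} P\,dm = 1$
by (5.5). Next we show that $\mathbb{J}_2(P)$ is bounded from below. We first note that since
$P \in \bar{\mathfrak{P}}_{+,o}$ we have $p_{\mathbf 0} = 1$, and thus $P$ is bounded away from the zero polynomial.
Now, since $x\log(x)$ achieves a minimum $> -\infty$ on any compact set $[0, a]$ (indeed
$x\log x \ge -1/e$ for all $x \ge 0$), $P\log P$ must achieve a minimum $> -\infty$ on $\mathbb T^d$.
Calling this minimum $\kappa_P$, we have

\[
\int_{\mathbb T^d} P\log P\,dm \ge \int_{\mathbb T^d} \kappa_P\,dm = \kappa_P.
\]

To bound the term $-\langle\gamma, p\rangle$ from below we note that

\[
\langle\gamma, p\rangle = \sum_{\mathbf k\in\Lambda} \gamma_{\mathbf k}\bar p_{\mathbf k}
\le \sum_{\mathbf k\in\Lambda} |\gamma_{\mathbf k}|\,|p_{\mathbf k}|
\le \|\gamma\|_\infty \sum_{\mathbf k\in\Lambda} |p_{\mathbf k}|
\le \|\gamma\|_\infty |\Lambda|\,\|p\|_\infty
\]

and thus $-\langle\gamma, p\rangle \ge -|\Lambda|\,\|\gamma\|_\infty\|p\|_\infty = -|\Lambda|\,\|\gamma\|_\infty$, since $\|p\|_\infty = p_{\mathbf 0} = 1$ by Lemma
A.1. Hence there exists some $\rho > -\infty$ such that $\mathbb{J}_2(P) \ge \rho$. From this we have

\[
r - \rho \ge \mathbb{J}_1(P, Q) \ge \varepsilon\|Q\|_\infty - \log\|Q\|_\infty,
\]

so comparing linear and logarithmic growth we see that the set is bounded both from
above and below. As before, since it is the sublevel set of a lower semicontinuous
function it will be closed, and hence it is compact. □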

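Proof of Lemma 5.3. Consider the directional derivative of $\mathbb{J}$ in a point $(P, Q) \in
\bar{\mathfrak{P}}_{+,o} \times \bar{\mathfrak{P}}_+$ in any direction $(\delta P, \delta Q)$ such that $P + \varepsilon\delta P \in \bar{\mathfrak{P}}_{+,o}$ and $Q + \varepsilon\delta Q \in \bar{\mathfrak{P}}_+$
for all $\varepsilon \in (0, a)$ for some $a > 0$. A quite straightforward calculation yields

\[
\delta\mathbb{J}(P, Q; \delta P, \delta Q) = \langle c, \delta q\rangle - \langle\gamma, \delta p\rangle + \int_{\mathbb T^d} \left(\delta P \log\frac{P}{Q} - \delta Q\,\frac{P}{Q}\right) dm,
\]

where we have used the fact, obtained from (5.5), that $\int_{\mathbb T^d} \delta P\,dm = \delta p_{\mathbf 0} = 0$, since
$p_{\mathbf 0} = 1$ is constant. Likewise, the second directional derivative becomes

\[
\delta^2\mathbb{J}(P, Q; \delta P, \delta Q) = \int_{\mathbb T^d} P\left(\delta P\,\frac{1}{P} - \delta Q\,\frac{1}{Q}\right)^2 dm,
\]

which is clearly nonnegative for all feasible directions and hence positive semidefinite.
Thus the problem is convex. □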
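
The closed form of the second directional derivative can be sanity-checked pointwise, since it is the integral of the second derivative of the integrand $P\log P - P\log Q$ along $(\delta P, \delta Q)$. The sketch below treats $P$, $Q$, $\delta P$, $\delta Q$ as plain positive numbers at a single point and compares a central second difference with $P(\delta P/P - \delta Q/Q)^2$; the function names and step size are ours.

```python
import math


def integrand(P: float, Q: float) -> float:
    """Pointwise integrand P log P - P log Q of the functional."""
    return P * math.log(P) - P * math.log(Q)


def second_difference(P, Q, dP, dQ, h=1e-4):
    """Central second difference of the integrand along the direction (dP, dQ)."""
    forward = integrand(P + h * dP, Q + h * dQ)
    backward = integrand(P - h * dP, Q - h * dQ)
    return (forward - 2.0 * integrand(P, Q) + backward) / h ** 2


def closed_form(P, Q, dP, dQ):
    """Pointwise value P * (dP / P - dQ / Q)^2, which is always nonnegative."""
    return P * (dP / P - dQ / Q) ** 2


if __name__ == "__main__":
    args = (1.3, 0.7, 0.4, -0.9)
    print(second_difference(*args), closed_form(*args))
```

The two printed values should agree to several digits; the nonnegativity of the closed form is what gives convexity.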

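Proof of Lemma 6.4. First note that $\mathfrak{C}_+(\mathbf N) \subset \mathfrak{C}_+$. To prove the lemma, it is
sufficient to prove that any $c \in \mathfrak{C}_+$ belongs to $\mathfrak{C}_+(\mathbf N)$ if $\min(\mathbf N)$ is large enough.

Let $c \in \mathfrak{C}_+$. From (3.4) there exists $\kappa_c > 0$ such that

\[
\langle c, p\rangle \ge \kappa_c\|p\|_\infty \quad \text{for all } p \in \bar{\mathfrak{P}}_+. \tag{A.2}
\]

We want to show that $\langle c, p\rangle > 0$ for any $p \in \bar{\mathfrak{P}}_+(\mathbf N) \setminus \{0\}$. Without loss of generality
we may take $\|p\|_\infty = 1$. Then $|\partial P(e^{i\theta})/\partial\theta_j| \le \sum_{\mathbf k\in\Lambda} |k_j|$, and since $P(e^{i\theta}) \ge 0$ in
$\theta \in \mathbb T^d_{\mathbf N}$, it follows that $P(e^{i\theta}) \ge -\pi\Delta/\min(\mathbf N)$ for all $\theta \in \mathbb T^d$, where $\Delta = \sum_{\mathbf k\in\Lambda} \|\mathbf k\|_1$.
Therefore $P + \pi\Delta/\min(\mathbf N) \in \bar{\mathfrak{P}}_+$, and by using (A.2) we get

\[
\langle c, p\rangle + c_{\mathbf 0}\,\frac{\pi\Delta}{\min(\mathbf N)} \ge \kappa_c\left(\|p\|_\infty - \frac{\pi\Delta}{\min(\mathbf N)}\right).
\]

By selecting $\min(\mathbf N) > \pi\Delta(1 + c_{\mathbf 0}/\kappa_c)$, we obtain $\langle c, p\rangle > 0$. Since $p \in \bar{\mathfrak{P}}_+(\mathbf N) \setminus \{0\}$
is arbitrary, it therefore follows that $c \in \mathfrak{C}_+(\mathbf N)$. □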

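Proof of Lemma 6.5. For a fixed $Q \in \mathfrak{P}_+$ we have $\lim_{\min(\mathbf N)\to\infty} \mathbb{J}_P^{\mathbf N}(Q) = \mathbb{J}_P(Q)$,
since the sums in (6.3b) are Riemann sums converging to (6.3a). Hence we can define
$L := \sup_{\mathbf N} \mathbb{J}_P^{\mathbf N}(Q) < \infty$. Also, by optimality, $\infty > \mathbb{J}_P^{\mathbf N}(Q) \ge \mathbb{J}_P^{\mathbf N}(\hat Q_{\mathbf N})$ for all values of
$\mathbf N$ and also $\infty > \mathbb{J}_P(Q) \ge \mathbb{J}_P(\hat Q)$. Using this and Lemma 3.3 we obtain

\[
L \ge \mathbb{J}_P^{\mathbf N}(Q) \ge \mathbb{J}_P^{\mathbf N}(\hat Q_{\mathbf N}) \ge \varepsilon_{\mathbf N}\|\hat Q_{\mathbf N}\|_\infty - \|P\|_1 \log\|\hat Q_{\mathbf N}\|_\infty
\]

for all values of $\mathbf N$. In accordance with (3.5), we can choose $\varepsilon_{\mathbf N} := \kappa_{\mathbf N}/|\Lambda|$, where $\kappa_{\mathbf N}$
is the minimum value of $\langle c, q\rangle$ on the compact set $\{Q \in \bar{\mathfrak{P}}_+(\mathbf N) \mid \|q\|_\infty = 1\}$. If we
can show $\kappa_c := \inf_{\mathbf N} \kappa_{\mathbf N} > 0$, we can choose $\varepsilon := \kappa_c/|\Lambda| \le \varepsilon_{\mathbf N}$ for all $\mathbf N$, so that

\[
L \ge \varepsilon\|\hat Q_{\mathbf N}\|_\infty - \|P\|_1 \log\|\hat Q_{\mathbf N}\|_\infty.
\]

Then comparing linear and logarithmic growth this implies that $(\hat Q_{\mathbf N})$ is bounded.

To show that $\kappa_c > 0$, first note that for every finite value of $\min(\mathbf N)$ we have
$\kappa_{\mathbf N} > 0$. Now assume $\inf_{\mathbf N} \kappa_{\mathbf N} = 0$. Then there must exist a sequence $(q_{\mathbf N})$ such that
$\langle c, q_{\mathbf N}\rangle \to 0$ as $\min(\mathbf N) \to \infty$, where $q_{\mathbf N} \in \bar{\mathfrak{P}}_+(\mathbf N)$ and $\|q_{\mathbf N}\|_\infty = 1$. Now, since every
$q_{\mathbf N}$ is a vector in the finite-dimensional space $\mathbb{C}^{|\Lambda|}$, the constraint $\|q\|_\infty = 1$ defines a compact set. Hence there
is a subsequence, also indexed with $\mathbf N$, so that $q^* := \lim_{\min(\mathbf N)\to\infty} q_{\mathbf N}$ is well-defined
and $\|q^*\|_\infty = 1$. Then $\langle c, q^*\rangle = 0$. However, since $c \in \mathfrak{C}_+$ and $q^* \in \bar{\mathfrak{P}}_+$ this implies
that $q^* = 0$, which contradicts $\|q^*\|_\infty = 1$. Hence $\kappa_c > 0$, as claimed. □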
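
The Riemann-sum convergence used at the start of this proof can be illustrated on a toy example. The sketch below is one-dimensional and uses a choice of ours, not taken from (6.3): $P \equiv 1$ and $Q(e^{i\theta}) = 1 + a^2 - 2a\cos\theta$ with $0 < a < 1$, for which $\int_{\mathbb T} \log Q\,dm = 0$ and the $n$-point grid average equals $(2/n)\log(1 - a^n)$ exactly.

```python
import math


def log_riemann_sum(a: float, n: int) -> float:
    """Average of log Q over the uniform grid theta_j = 2*pi*j/n, j = 0, ..., n-1."""
    total = 0.0
    for j in range(n):
        theta = 2.0 * math.pi * j / n
        total += math.log(1.0 + a * a - 2.0 * a * math.cos(theta))
    return total / n


def exact_grid_value(a: float, n: int) -> float:
    """Closed form (2/n) * log(1 - a^n), from prod_j (1 - a w^j) = 1 - a^n."""
    return 2.0 * math.log(1.0 - a ** n) / n


if __name__ == "__main__":
    for n in (4, 8, 16, 32, 64):
        print(n, log_riemann_sum(0.5, n), exact_grid_value(0.5, n))
```

The grid averages tend to the continuous value $0$ geometrically fast in $n$, which is the behavior the proof relies on for a fixed strictly positive $Q$.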